

Brit. J Phil. Sci. 38 (1987), 283-299 Printed in Great Britain 




Is Genetic Epistemology Possible?


by RICHARD F. KITCHENER* 


Several philosophers have questioned the possibility of a genetic epistemology, an 
epistemology concerned with the developmental transitions between successive 
states of knowledge in the individual person. Since most arguments against the 
possibility of a genetic epistemology crucially depend upon a sharp distinction 
between the genesis of an idea and its justification, I argue that current philosophy 
of science raises serious questions about the universal validity of this distinction. 
Then I discuss several senses of the genetic fallacy, indicating which sense of 
‘genesis’ is relevant to epistemology. Next I consider the objection that psychology 
is irrelevant to epistemology, and that since “genetic epistemology” is really
psychology, “genetic epistemology” is irrelevant to a real epistemology. Finally, I take
up the objection that nothing discovered in genetic psychology could be relevant 
to a genetic epistemology. These last two arguments are based upon what I claim 
to be a mistaken notion of the nature of psychology. Suitably interpreted,
psychology can assist genetic epistemology precisely in the way that the history of
science assists current philosophy of science. 


1 Introduction
2 ‘Historicist’ Philosophy of Science
3 The Genetic Fallacy
4 The Nature of Psychology
5 Genetic Psychology and Genetic Epistemology


1 INTRODUCTION


Although few philosophers today would question the possibility of epis- 
temology (at least if interpreted in a non-foundationalist way), several 
philosophers (e.g., Hamlyn [1971], Siegel [1978, 1980]) have explicitly 
questioned the possibility of a genetic epistemology,¹ and many other philosophers,
upon reflection, would also be inclined to reject such a possibility.




* I owe considerable thanks to Jann Benson, Ken Freeman, Bernie Rollin and Ron Williams 
for their helpful discussions concerning many of the issues discussed in this paper. I also
wish to thank David Hamlyn, John Heil, William Lycan, Harvey Siegel and an anonymous 
reviewer for their comments and suggestions. It goes without saying that none of these 
individuals (especially Hamlyn and Siegel) necessarily agree with me. An earlier version of 
this paper was read at Colorado State University where the audience’s comments were 
beneficial. 

¹ More recently, Hamlyn [forthcoming] has modified his earlier views on this question and
now agrees that in some sense genetic epistemology is possible. In private correspondence 
Hamlyn suggests he has never really questioned the possibility of a genetic epistemology, 
only what it is. Be this as it may, many individuals have interpreted Hamlyn’s remarks 
(especially his [1971]) as being an attack on the very possibility of genetic epistemology. 




In this paper I am concerned with combating a cluster of arguments lodged 
against the possibility of a genetic epistemology and thus (indirectly) with 
showing that genetic epistemology is possible. 

I do not, of course, discuss all the objections that could be raised against 
the possibility of a genetic epistemology, but I do cover (I think) one major 
class of arguments, namely, those based upon a fact-norm distinction, one 
example of which is the genesis-justification distinction. Other types of 
arguments against the possibility of a genetic epistemology rely upon a 
sharp distinction between discovery and justification, a sharp distinction 
between the conceptual (or philosophical) and the empirical (or scientific),
and/or a sharp distinction between the analytic and the synthetic. Although 
I do not explicitly discuss these other arguments, a similar case can also be 
made against them. It is now clear, for example, that the discovery vs.
justification distinction needs to be at least re-evaluated and re-interpreted 
(see Nickles [1980a, 1980b]). Similar remarks apply to the distinction 
between the empirical (scientific) and the conceptual (philosophical). In 
fact, several recent philosophers of science have even gone so far as to claim 
that the philosophy of science is an empirical science! But even if we reject 
that more extreme view, it does seem clear that one can no longer draw a 
sharp distinction between the conceptual and the empirical, since the 
empirical realm seems to be relevant (in some sense) for evaluating the 
adequacy of a conceptual analysis. All of this applies irrespective of the 
increasingly large number of arguments advocating a naturalistic epis- 
temology (Kornblith [1985]), the very possibility of which throws the above 
distinctions into question. Finally, a sharp analytic-synthetic distinction, it 
need hardly be pointed out, is currently no longer widely accepted among 
post-Quinean philosophers. 

Although I do not show in a direct or positive way that genetic epis- 
temology is possible, I do show this in another way since, as Hamlyn 
[forthcoming] points out, to ask whether something is possible is (often at 
least) to presuppose that it is not possible and hence if I successfully disarm 
the arguments against it, this presumption will be nullified. 

Although there are several candidates for a genetic epistemology, e.g.,
John Locke, James Baldwin, Ernst Cassirer, the most widely known such 
theory is that of Jean Piaget. In the remainder of this paper, therefore, I
will take Piaget’s genetic epistemology to be the paradigm case of a genetic
epistemology. I do this for two reasons: first, Piaget [1950a, 1950b, 1950c, 
1957, 1967a, 1967b; Piaget & Garcia, 1983] has written more extensively 
on genetic epistemology than any other individual. (I have discussed the 
philosophical status of Piaget’s genetic epistemology in Kitchener [1986].) 
Secondly, his genetic epistemology (unlike other versions) is directly tied 
to the history and philosophy of science and hence constitutes the best 
example of the thesis for which I am arguing in this paper. 

Finally, I should point out, en passant, that several other individuals
(Gärdenfors [1984], Harman [1986], Harper [1977], Levi [1980], Rescher
[1982]) have discussed an epistemology similar to what I call a genetic
epistemology, namely, an epistemology concerned with ‘belief revision’, 
‘probability kinematics’, ‘epistemic dynamics’, etc. Although several of 
their ideas are similar to a ‘genetic’ or ‘developmental epistemology’, their 
concern basically seems to be with constructing logical models underlying 
this process whereas I am more concerned with conceptualising the possi- 
bility of such a genetic epistemology. Like Toulmin [1972] and Wartofsky 
[1979], I am fundamentally concerned with the kinematics and dynamics 
of epistemic change without committing myself to any of the issues con- 
cerning the logic of such changes. 


2 ‘HISTORICIST’ PHILOSOPHY OF SCIENCE 


A genetic epistemology, as contrasted with more traditional types of epis- 
temology, would be concerned (in some sense yet to be indicated) with the 
historical development of knowledge in the individual person. Whether it would 
be concerned with something more than this, for example, whether it should 
include under its scope both the development of knowledge in the individual 
and in the history of science [Kitchener 1981, 1983b], is a question I will 
presently leave open. At the very least, however, it would include the 
growth of knowledge in the individual (and for Piaget it would also include 
the growth of scientific knowledge). Furthermore, whether a genetic epistemology
would be concerned with the development of beliefs as opposed,
say, to the development of cognitive stages, categories or epistemic states
is not crucial for present purposes. 

At the heart of any such genetic (or better: developmental) epistemology 
would be a concern with the acquisition of knowledge and, in particular, 
with the epistemic transition from one state of knowledge in the individual to 
the next. (For reasons that will become clearer later, I wish to separate this 
conception of genetic epistemology, one which stresses epistemic transitions 
from a more radical version—naturalistic epistemology—which would 
replace epistemology by psychology.) One of the fundamental supports for 
the rejection of such an epistemology is the following widely-held principle: 


Questions about the genesis of an idea (belief, concept, theory) are one thing
(empirical questions for psychology, sociology or history), whereas questions
about the validity and justification of an idea are a different matter (normative
questions for logic and epistemology).


I will call this the genesis vs. justification distinction. It is one particular
form of the fact-norm distinction and provides the underlying rationale for 
the notorious genetic fallacy. 

Given this distinction, we seem to have an inescapable dilemma for 
any genetic epistemology (Hamlyn [1971]). On the one hand, if a genetic 
epistemology confuses these two points of view and attempts to evaluate
epistemic claims on the basis of their historical development, it would
be guilty of a (genetic) fallacy and, consequently, would be conceptually
impossible. On the other hand, if these two points of view are kept distinct 
and genetic epistemology is (properly) located on the empirical (genetic) 
side, then nothing discoverable there would be philosophically relevant to 
a normative question concerning justification, which belongs on the other 
side. Once again, genetic epistemology (as opposed, say, to genetic psy- 
chology) would be impossible. Either way, however, genetic epistemology 
is impossible. What is clearly being ruled out, therefore, is the very possi- 
bility that a question about the ‘genesis’ of an idea could have some relevance 
towards evaluating its epistemic adequacy. 

Although the above argument (and its associated dilemma) is (or would 
be) widely accepted by many philosophers and is, in fact, strongly en- 
trenched in their arsenal of philosophical weapons, such an argument pre- 
sents something of a puzzle. For if it were successful, it would show not 
only that genetic epistemology is impossible but also that, mutatis mutandis, 
contemporary philosophy of science is impossible. 

According to most contemporary philosophers of science, what is characteristic
of current “post-positivistic” philosophy of science (the views
associated with Popper, Lakatos, Kuhn, Feyerabend, Toulmin, Laudan,
Shapere, McMullin, Hesse, Harré et al.) is the fact that it has taken a
historical turn, perhaps even a “historicist” turn, and now quite firmly
insists not only that the history of science is relevant to the philosophy of 
science but that actual scientific practice (both historical and current) has 
some evidential role to play in assessing the adequacy of a philosophy of 
science. (Although the literature on this point is vast, a summary of the
arguments in its favour, as well as of the historical reasons that have led up to
it, can be found in Suppes [1979].)

This point has been articulated in various ways (Burian [1977], Chalmers 
[1979], Lakatos [1980], Laudan [1977, 1979], McMullin [1976, 1970, 1979], 
Musgrave [1974]), but I will mention only two. First, it has been claimed 
that a scientific theory (paradigm, research programme) is best construed 
not as a logical entity (consisting of a set of atemporal propositions)
but rather as a developmental entity, an entity whose “nature”
unfolds over time in response to its changing background (including evolv- 
ing background knowledge, competing theories, new evidence, etc.). Hence, 
one must consider the developmental capacities, potential, fertility, etc., of 
a theory, and this requires an examination of its developmental history. If 
a scientific theory is the basic epistemic unit in science, then philosophy qua 
epistemology must be a historical or developmental epistemology since an 
understanding of this epistemic unit must involve an understanding of its 
historical development. 

Secondly, the epistemic evaluation and appraisal of a scientific theory 
must also be historical in nature. If we are to evaluate the adequacy of a 
scientific theory, we obviously need to know answers to questions such as: 
how much evidence at a certain time supports the theory? But we also need
to know such things as: In the face of anomalous data (“falsifications”),
how was the theory modified? Did it take on an ad hoc form, or was it
modified so as to make novel predictions? Were the historical modifications
such that they constituted a progressive or deteriorating pattern? Has the
theory been fertile? And so forth.

Notions such as ‘ad hoc’, ‘novelty’, ‘progressive’, ‘fertility’, as well as 
questions about the ‘growth’ of science in general appear to be questions 
that require a historical examination of the particular theory in question 
and an assessment of its past “track record”. Thus, “the career of a theory
is, at least sometimes, more important than the formal relations between 
evidence claims and theoretical postulates at any stage of the theory’s 
history” (Burian [1977], p. 8). But to insist that the past career of a theory 
is often crucial in the epistemic evaluation of a theory is to say that epistemic 
appraisal involves historical (temporal) questions, e.g., knowing when a par- 
ticular item of evidence was available to the scientific community or the 
temporal order of predictions. But more than this (which would
only show that temporal questions enter into epistemic appraisal), epistemic
evaluation is itself developmental in nature, and not merely formal-logical, i.e.,
we have to consider this entire temporal development of changing theory- 
environment relations and ask if there is any overall pattern of progress 
(novelty, growth, fertility) to be found there. According to some philo- 
sophers of science, for example, in order for scientific change to be rational, 
it must satisfy a cumulativity condition in which a new theory must (at least 
in the long run) explain as much as its predecessor did and then some. 

Concepts like ‘novelty’, ‘fertility’, ‘progress’, ‘cumulativity’ appear to be
epistemic criteria that are historical (or better: developmental) in nature: 
they inevitably involve developmental questions in one’s epistemic 
appraisal.¹ This is one of the reasons why theories are sometimes viewed as
developmental entities (Kitchener [1983a]): an earlier epistemic stage of a 
theory, its underlying nature, and an evolving background place (epistemic) 
constraints on possible subsequent modifications. As a theory is sub- 
sequently modified during its career, thus producing a series of stages, its 
capacity for subsequent change is modified. How a theory can be developed 
in the early stages of its career is quite different from how it can be modified 
during its waning years. Here it is appropriate to talk of its developmental 
capacities, natural paths, directions of development, etc. 

If all of this is correct (and I am here assuming without argument that
it is correct), then one cannot so neatly bifurcate “questions of genesis”
from “questions of validity”, for “historically-oriented” philosophers of
science are claiming that the epistemic evaluation of a scientific theory is 
historical in nature and based upon the genesis and development of a 


¹ An additional sense in which these criteria are developmental would involve the claim that
they themselves change (and perhaps progress) over time. For reasons of space, I do not
discuss this feature, although it is clearly relevant to the central thesis of this paper concerning
the possibility of genetic epistemology.




scientific theory. To grant that one should sharply separate these two points 
of view (as the critics of genetic epistemology claim one should) and insist 
that one never “evaluate the adequacy of epistemic claims on the basis
of their historical development” would be tantamount to claiming that 
contemporary philosophy of science is both conceptually confused and 
logically mistaken and hence that it is impossible. Of course, it may be that 
contemporary philosophy of science is conceptually impossible, since it 
does not adhere to the genesis-justification distinction but instead allows 
that facts are relevant to norms. That point, however, is yet to be shown 
even though some philosophers (e.g., Siegel [1980]) have, in effect, claimed 
this. There are at least two forms such an argument might take. (1) It might 
be argued that philosophy of science is a purely prescriptive discipline, 
prescribing what scientists ought to do or believe, e.g., ‘scientists ought 
never to accept a theory unless it is confirmed to degree k’. But aside from 
the difficulty of making sense out of such a program, it is in fact difficult to 
find anyone who has actually advocated such a purely prescriptive approach. 
Old-line logical empiricists such as Carnap, Hempel, Reichenbach and 
Feigl, for example, clearly rejected and continue to reject such a view (see, 
e.g., Feigl [1974]). (2) The second (much more plausible) line of argument 
would rather claim that philosophy of science is neither descriptive nor 
prescriptive but rather engaged in a rational reconstruction, analysis, or 
explication of scientific concepts, arguments and activity (Hempel [1979], 
Feigl [1974]). But what is abundantly clear is that the very notions of
‘rational reconstruction’ and ‘explication’ are in need of considerable clarification
and that, even on the most generous interpretation, rational reconstructions
and explications are constrained by historical facts, a point Hempel
[1979], Feigl [1974] and others insist upon. According to Reichenbach
[1938, p. 6], for example, a rational reconstruction must satisfy a principle 
of correspondence with actual thinking, and Carnap [1962, p. 7] claims that 
the “explicatum must be similar to the explicandum...”. In short, actual
scientific practice places firm constraints on an adequate philosophical 
analysis, explication or rational reconstruction. But if this is so, then philosophy
of science cannot be pursued in splendid isolation from the history
of science.

If, as I am assuming, contemporary “historicist” philosophy of science
is basically correct in its metaphilosophy of science, the argument for the 
conceptual impossibility of genetic epistemology cannot, at least as it now 
stands, be accepted. As a result, this shifts the burden of proof onto the 
critics of genetic epistemology, who now must show that although the
genesis-justification distinction does not have exceptionless validity, it does 
have validity in the context of genetic epistemology. After all, there may be 
important differences between the two areas such that what is true of the 
philosophy of science is not true of genetic epistemology. (I will consider 
one such alleged difference below.) But that would involve something 
more (an argument showing what these differences are) rather than
merely an uncritical acceptance of the genesis-justification distinction tout
court. In short, what I am claiming is that the general argument against 
genetic epistemology, the one based upon the assumed universal validity 
of the genesis-justification distinction, is not adequate. 


3 THE GENETIC FALLACY 


At this point the critic of genetic epistemology is likely to make the following 
response. I have misunderstood, they are likely to claim, the point of the 
genesis-justification distinction and the correlative point about the genetic 
fallacy. For what my argument from current philosophy of science shows 
is that philosophers of science are not interpreting the genesis-justification 
distinction the way it ought to be (or usually is) interpreted. When someone 
says: “One cannot evaluate the adequacy of an idea on the basis of its 
genesis”, they mean the same thing as one would mean when it is said that
one cannot evaluate the adequacy of an idea on the basis of the simple fact 
that people (even most people) believe it is so. As one historically-oriented 
philosopher of science (Hanson [1962], p. 581) puts it: 


That x is done universally does not in itself make x the universally correct thing to 
do. That all past and present scientists do x or say that x—or are said by historians 
to have done or to have said that x—does not in itself make x the correct thing to 
do or to say [p. 581] (see also Hanson [1967]).


Likewise, the mere fact that something has happened in a certain temporal 
order does not in itself entail anything about its validity or justification; the 
confusion of the mere temporal order with the logical order is, however, 
the genetic fallacy. 

This point, suitably interpreted, is certainly correct and no one would 
want to dispute it. But if so, and if philosophers of science are cognizant of 
the genetic fallacy, then how (it may be asked) can it be claimed that the 
epistemic appraisal of a scientific theory is historical (genetic) in nature? Is 
not “historicist” philosophy of science guilty, after all, of this genetic
fallacy? 

My answer here is: “No, and neither is genetic epistemology.” The issue 
depends, in a crucial way, on what one takes the genetic fallacy to be and 
what one understands by ‘genesis’. Three types of genetic processes can be 
distinguished and, therefore, three different types of genetic fallacy need to 
be separated: a temporal sequence, a causal sequence, and a rational-logical 
sequence. 

A mere temporal (or historical) sequence consists of a set of conjunctive 
temporal events: P(t1) & Q(t2) & R(t3).... Nothing of any logical import 
seems to follow from mere temporal sequences and if one identified the 
logical order with the temporal order, one would be guilty of a genetic 
fallacy. But, as I have characterised historically-oriented philosophy of
science, no one is making the claim that a mere de facto historical order of
scientific events is epistemologically crucial in evaluating scientific theories.
What is lacking in a de facto temporal sequence is the notion of these events 
being related to each other in the right kind of way. An accidental temporal 
concatenation of separate and discrete events has very little, if any, epistemological
relevance. But this is not what is being claimed by these philosophers
of science. A scientific theory, for example, is an entity which
persists through time and is developing. But a mere temporal order need 
have no such enduring entities nor a strong underlying order to it. 

A second interpretation of the genetic fallacy takes ‘genetic’ in a stronger 
causal sense, thereby giving some underlying connection to the set of tem- 
poral events. Just as one can find several individuals who interpret the 
genetic fallacy as a fallacy involving mere temporal order, several other 
individuals take the genetic fallacy to involve a fallacy concerning causality. 
Typical here are individuals who take the genetic fallacy to be a fallacy of 
evaluating an idea on the basis of its causal source or origin, e.g., a dream or 
a drug. But this is clearly not what is meant in the philosophy of science 
(nor in genetic epistemology) by ‘genesis’, since we are concerned with a 
sequence of events and not its causal source. However, even if we take the 
sequence in question to be a causal sequence in which P(t1) causes Q(t2), 
which causes R(t3), this will not do. For, again, a mere factual causal 
sequence in itself need have no epistemic relevance for reasons similar to 
those given in regard to temporal order. A causal sequence of a very special 
kind might have such relevance, but a de facto causal sequence concerning 
related events would not typically be the kind of genetic sequence philosophers
of science have in mind. For example, if a scientist’s dream leads
him/her to construct theory T (at t1) and later (t2) a drug leads him/her to
modify T to T′, and still later (t3) a sleepless night leads him/her to modify
T′ to T″, such a sequence would in itself have little epistemic relevance.
What would be relevant here is not the sequence of causes per se but rather
the sequence of theory changes (T → T′ → T″) together with their underlying
reasons. Here we have a sequence of theoretical modifications backed (presumably)
by reasons and likewise in genetic epistemology we will have
cognitive stages (or beliefs) constructed for reasons.

In these cases we are taking ‘genesis’ in a still different sense, namely, as 
a rational or logical (conceptual) development, i.e., a sequence of epistemic
stages rationally and epistemically related to each other. As I have already
suggested, when a theory T is modified to T′ (in order to explain a recently
discovered anomaly) and in the process T′ makes novel predictions, we can
both talk about the developmental stages of the theory and we can also ask 
how such stages are logically related to each other. Likewise, we can often 
(with hindsight) look back upon such stages and judge them to be pro- 
gressive and to have given us a gain in knowledge. 

Similar remarks can be made about a person’s beliefs. We can trace the 
development of these beliefs (B → B′ → B″) and see how an individual’s
awareness of the problem encountered by his/her belief (B) together with
the acquisition of new information and perhaps even different processes of
reasoning led to a modification of B to B′. In this way, we can chart how a
person came to hold B′, why (s)he abandoned B, how B and B′ are conceptually
related to each other, etc. This is a genetic (developmental) process!
One can, of course, speak of the sequence as being causal in nature,
just as long as one remembers that it is also a rational-logical and conceptual 
sequence. 

The problem with most discussions of the genetic fallacy and related 
issues (e.g., Siegel [1980]) is that they fail to distinguish two senses of ‘how
a person comes to hold a belief’: an ordinary non-cognitive causal sense
(‘causally comes to believe’) and a cognitive-rational sense (‘rationally 
comes to believe’). Consider, for example, two ways of explaining how we 
have come to have a particular belief. One could give a non-cognitive causal 
explanation of it in terms, say, of operant reinforcement: x has belief B 
(e.g., that Einstein’s theory is false) because x has been reinforced for 
believing it by his/her anti-semitic friends. On the other hand, one could 
explain how x came to have belief B by considerations such as the following: 
x read numerous books about theories of space-time, x carefully considered 
the arguments pro and con for each theory, x acquainted him(her)self with 
all the extant evidence relevant to the issue, x discussed the issue with 
several cosmologists, etc., and because of these factors x came to believe B. 
Here we do not have merely a causal sequence of events, we also have a 
rational-epistemic sequence. In this case, we can say that how x came to 
hold his/her belief is relevant to assessing its credibility, and thus that the 
genesis of the belief is relevant to assessing its rationality. Likewise, one 
could attempt to study not only why a belief is initially held but also why 
this belief was sustained, how this belief was modified in the face of new 
evidence, how it came to be qualified and then rejected, and finally how a 
view opposed to B came to be held. Here we would be studying the genetic 
development of a belief in terms of x’s reasoning, beliefs, perceptions, new 
information, etc., and we could give an epistemic appraisal of this particular 
belief based upon this developmental sequence. 

If we take ‘genetic sequence’ in the third sense of a rational (devel- 
opmental) sequence—a sequence of cognitive stages related to each other 
in terms of reasons rather than brute causes—it becomes clear both that 
“historicist” philosophy of science is taking ‘historical sequence’ in this 
sense and that no genetic fallacy is being committed. Similarly, genetic 
epistemology is concerned with what can be called the developmental logic 
underlying the relations between successive cognitive stages, the relations 
between these cognitive stages and the epistemic object, the modifications 
of these cognitive stages as a result of their inadequacy in coping with the 
environment, etc. Such a genetic epistemology would take these genetic 
sequences (primarily) in the third sense, although it would also (on one 
interpretation) countenance the notion that these reasons are also causes, 
and that (for a full account) brute causal sequences must also be considered. 




The crucial point to note, however, is what such a genetic epistemology 
would be and how it would lead to a re-evaluation of the genesis-justification 
distinction (or at least one interpretation of it). 


4 THE NATURE OF PSYCHOLOGY 


I have been suggesting that genetic epistemology does not commit the 
genetic fallacy any more than contemporary philosophy of science does. 
Both fields are concerned with what I have called rational genesis rather
than with a non-cognitive causal genesis. Such rational genesis is, I have 
suggested, subject to epistemic evaluation. In short, there is no reason 
epistemology cannot contain a genetic (or developmental) dimension as 
long as ‘genetic’ is understood in this particular way. Several objections to 
the very possibility of a genetic epistemology remain to be discussed, 
however. Two in particular stand out: (1) since “genetic epistemology” is
really just genetic psychology and since psychology is irrelevant to epistemology,
“genetic epistemology” is thus irrelevant to epistemology, and
(2) since genetic epistemology is really a philosophical epistemology stricto
sensu, in which factual questions are irrelevant, and since psychology is
obviously concerned with these factual questions, genetic psychology is 
irrelevant to genetic epistemology. These two arguments will be discussed 
in this and the next section. 

According to a rather widespread view (e.g., Hamlyn [1971], Siegel 
[1980]), psychology is irrelevant to epistemology. The basis for this claim 
typically consists of two components: first, the assumption that there is a 
sharp fact-norm distinction; and second, the claim that whereas psychology
is a purely factual science, epistemology is normative in nature. (I have 
discussed this more fully in Kitchener [forthcoming].) Since genetic epis- 
temology, it is claimed, is a branch of developmental psychology, it follows 
that “genetic epistemology” is merely factual and therefore irrelevant to 
epistemology.¹ The two key issues in this argument obviously concern how
one should conceptualise the fact-norm distinction and how one should 
conceptualise the nature of psychology. Since I have already (indirectly)
suggested there is no sharp categorical distinction between facts and norms 
(see also Kitchener [1980]), I will briefly take up the second point: the
widespread claim that psychology is a purely empirical (i.e., causal)
science.

What appears to be the standard view is that psychology is an empirical 
science, and for most individuals this means that psychology is a causal 


¹ This view is wrong, however. Piaget’s genetic epistemology is not a branch of developmental
psychology, nor is it “merely factual”. Although it includes an empirical, psychological part,
genetic epistemology also contains an irreducible normative part. Thus, genetic epistemology 
cannot be reduced to genetic psychology. Furthermore, not all genetic psychology is relevant 
to genetic epistemology, rather only that part which is cognitive in nature. I have discussed 
these points more fully in Kitchener [1986]. 


Is Genetic Epistemology Possible? 293 


science exclusively. Such a view, for example, can be found among several 
current philosophers of science (Lakatos [1980], Laudan [1977], Popper 
[1979]), who otherwise are “historicist” in their orientation. According
to these individuals, psychology is fundamentally irrelevant to the major 
epistemological concerns of the philosophy of science. One of the most 
influential views on this issue is that of Lakatos [1980], according to whom 
the major epistemological task of the philosophy of science is to provide 
normative principles by means of which one can then provide a rational 
reconstruction of the history of scientific knowledge and thereby give a 
“rational reconstruction of the growth of objective knowledge” [1980, 
p. 102]. These normative principles indicate, for example, what constitutes 
scientific progress, what is involved in an increase in scientific knowledge, 
what is scientific rationality, etc. An example of such a normative principle 
would be the following: theory T2 is better than T1 if T2 can explain the
success of T1, T2 makes new predictions, and some of these predictions
are corroborated [p. 32]. Such a principle would allow us to show the 
history of science to be a rational affair. For example, Copernicus’ research 
programme objectively and rationally superseded Ptolemy’s because it 
“predicted a wider range of phenomena, it was corroborated by novel facts 
and...it has more heuristic unity” [p. 189]. 

Such a rational reconstruction, showing what was (or would have been) 
rational, is part of the internal history of science (i.e., intellectual history)
and belongs to Popper’s World 3 (the quasi-Platonistic world of theories, 
abstract cultural and logical entities, mathematical systems, etc.). On the 
other hand, external history of science is the province of the empirical 
sciences of psychology and sociology and belongs to Popper’s World 2— 
the mental world. 

If internal history of science rationally explains the growth of objective 
knowledge, what then does external social-psychological history of science 
explain? Although Lakatos is not always clear about this point, he seems to 
maintain that it empirically explains the non-rational factors involved in 
the actual acceptance of the theory, what motivated scientists to accept it, 
etc. He says, for example: 


External history either provides non-rational explanation of the speed, locality, 
selectiveness, etc. of historic events as interpreted in terms of internal history, 
or, when history differs from its rational reconstruction, it provides an empirical 
explanation of why it differs. But the rational aspect of scientific growth is fully 
accounted for by one’s logic of scientific discovery [1980, p. 118]. 


This leads to the following principle of asymmetrical rationality: “... when
a thinker does what is rational to do, we need inquire no further into the causes 
of his actions; whereas, when he does what is in fact irrational... we require 
some further explanation” (Laudan [1977], pp. 188-9). Thus when a scientist
does what it is rational to do (on our normative criterion), this is the
normal (natural) thing and our quest for an explanation is over; by citing
what are good reasons for this action, we have fully explained it and, a 
fortiori, no further causal explanation is either needed or possible. But 
when (s)he does what is irrational, this abnormality or deviation requires a 
further, causal explanation; this will be advanced presumably by psychology 
and might involve, for example, unconscious motives, special rewards, 
religious bias, political pressure, etc. It is clear, therefore, that explanations 
in terms of reasons are, according to this account, World 3 entities, whereas 
the causal explanations of deviance advanced by psychology belong to 
World 2. 

On this account, psychology is conceived to be exclusively a causal 
science, belonging to World 2. Moreover, it is also being assumed that 
psychology can only causally explain irrational actions and deviations from 
rationality. It cannot explain the very rationality of a belief, which belongs
to World 3; it can only causally explain the irrationality of beliefs, which
belongs to World 2. Since psychology can only causally explain deviations
from rationality, and since philosophy of science qua epistemology is concerned
with explanations of rationality, psychology is irrelevant to epistemology,
i.e., to the rational genesis and development of knowledge. But
since genetic epistemology is just genetic psychology (according to this
argument), “genetic epistemology” is really a misnomer and is not relevant
to the rational genesis of knowledge. 

It seems clear that two fundamental assumptions underlying this argu- 
ment (and the general argument that psychology is irrelevant to epis- 
temology) are the following: (1) there is a sharp distinction between reasons 
and causes, and (2) psychology can only deal with causes. But neither of
these assumptions seems to be defensible.

Although there clearly is a difference between the concepts of a ‘reason’ 
and a mere ‘cause’, it does not follow that something can’t be both. My 
perception of my wife committing adultery may both cause me to form a 
new belief about her moral character and also be the reason supporting it. 
Cognitive states such as ‘belief’, ‘perception’, ‘memory’, etc., typically 
have precisely this hybrid character of being both causes and reasons, and
this naturally leads to the suggestion that one cannot make a sharp distinction
between the two classes of things. 

The problem is merely exacerbated when the Popperian framework of 
“three worlds” is introduced. For although it seems natural to claim that 
reasons belong to World 3 and that some psychological states (e.g., drives) 
belong to World 2, there is a whole range of other psychological concepts 
that straddle the fence. ‘Belief’ is precisely one of these concepts: beliefs
are psychological states (or acts), but they also have a propositional content 
intrinsic to them. They belong to World 2 even though their logical object 
or objective content belongs to World 3. This point seems to have been
recognised—somewhat dimly—by Lakatos [1980] when he says in a some- 
what puzzling passage: 




Actual states of minds, beliefs, etc., belong to the second world; states of the 
normal mind belong to a limbo between the second and third (p. 92). ... I want to 
distinguish between psychological, plainly second-world concepts, like ‘belief’, and 
psychologistic concepts like ‘rational belief’ in the sense of ‘belief of a clear mind’.
While psychology may be defined as the theory of mind, psychologism is the theory
of a ‘healthy’, ‘normal’, ‘clear’, ‘ideal’, ‘empty’, ‘purged’, ‘unbiased’, ‘objective’,
‘rational’, or ‘scientific mind’ (p. 208).... The study of actual scientific minds
belongs to psychology; the study of the ‘normal’ (or ‘healthy’ etc.) mind belongs to
a psychologistic philosophy of science. There are two kinds of psychologistic philosophies
of science. According to one kind there can be no philosophy of science: only a 
psychology of individual scientists. According to the other kind there is a psychology 
of the ‘scientific’, ‘ideal’ or ‘normal’ mind: this turns philosophy of science into a 
psychology of this ideal mind... (pp. 92-3). 


These comments by Lakatos lead to the second assumption mentioned
above and suggest the need to consider psychology as something more
than merely a causal science. This, I take it, is the thrust of Lakatos’ [1980,
p. 15] and Popper’s [1979, pp. 114, 156] claims that psychology, as presently
conceived, is merely a causal, World 2 science but that what is needed is 
something else: the abandonment of psychologism (the denial of World 3) 
and an appreciation of the objective nature of World 3. Whether we need 
all of the excess baggage of Popper’s three worlds or not, the point that 
remains valid is this: psychology need not be conceived to be a causal 
science and nothing more. That happens to be a very questionable legacy 
of Humean empiricism and recent positivism. Virtually all philosophers 
writing on genetic epistemology and historicist philosophy of science have 
taken psychology to be exclusively an empirical (causal) science and, more- 
over, have interpreted ‘empirical’ in the classical Humean-positivistic sense 
in which what is empirical is a set of lawful regularities between observable, 
logically independent atomistic facts, facts devoid of theoretical and normative
components. Such “brute facts” were thought to be causally related
to each other, and psychology had the task of discovering them. (A related 
conception, of course, is the assumption, widely held by philosophers of 
science, that psychology deals with private, subjective mental states.) But 
this is only one conception of psychology—an empiricist one—and arguably 
inadequate. For if we take a non-empiricist conception of psychology, e.g., 
a rationalistic one (as in Chomsky [1972]) or a structuralistic one (as in 
Piaget [1950c]), we need not restrict psychology exclusively to a causal 
study of mental states nor need we interpret the concept of ‘empirical’ and 
‘cause’ in a standard Humean way. In fact, Piaget’s particular version of 
genetic epistemology quite explicitly denies that psychology studies only 
causes and insists, on the contrary, that psychology has the task of giving 
both reason explanations and causal explanations. 

According to Piaget [1950c], causal explanations basically concern the 
bodily movements present, for example, during the early stages of an 
individual’s life. Reason explanations, by contrast, concern the logical, 
intentional and semiotic relations between contents of consciousness, e.g.,
if I desire a goal, then (ceteris paribus) I desire the means for attaining it.
Reason explanations increasingly come into play as the individual develops 
into a rational being and as causal explanations are replaced by reason 
explanations. Clearly, for Piaget, since psychology (at least part of it) has 
the task of explaining the growth of knowledge in the individual, it cannot 
do so merely with ordinary causal explanations. Instead it must appeal to 
reason explanations, explanations involving concepts such as ‘perceived 
dissonance’ and ‘contradiction’, ‘successful problem resolution’, ‘new cog- 
nitive strategies’, ‘more adequate cognitive structures’, ‘equilibrated, com- 
pensatory responses’, etc. But once psychology is given an explanatory role 
broader than that of providing mere causal explanations of movements, the 
objection that psychology is irrelevant to epistemology (and to philosophy 
of science) because it is merely a factual science begins to wither. 


5 GENETIC PSYCHOLOGY AND GENETIC EPISTEMOLOGY 


A final objection that needs to be considered is the following: Suppose there 
is (or could be) something called ‘genetic epistemology’, a study concerned 
with the rational genesis and development of ideas, beliefs, theories, etc. Such
a study might be considered to be just a part of ordinary normative epis- 
temology as traditionally practiced by philosophers, only now we would 
have a purely normative genetic epistemology. A normative genetic epis- 
temology would pursue its task independently of genetic psychology since 
genetic psychology would be a purely empirical science and, consequently, 
would have no relevance to a normative endeavour. But since advocates of 
a genetic epistemology firmly believe that genetic psychology is relevant to 
genetic epistemology, the possibility of such an empirically based genetic 
epistemology, one which looks to genetic psychology for evidence, would 
be ruled out. 

The basis of this objection appears to be the traditional belief that the 
task of any (proper) epistemology, including the philosophy of science, is 
to establish the justificatory status of an epistemic claim independently of and 
prior to factual considerations. Since, according to this view, the adequacy of 
any epistemic account can be completely decided on a priori (conceptual 
and analytic) grounds, no empirical evidence would be relevant to such an 
undertaking. Genetic psychology would thus have nothing to contribute, 
since if a genetic epistemology were possible, it would be carried out in a 
completely a priori fashion. 

It is clear, I think, that something like the following sufficiency condition 
is being presupposed: a priori epistemic analysis is sufficient in and of itself
for establishing the adequacy of any proposed epistemic account. It is precisely 
this sufficiency condition that is being questioned by genetic epistemologists 
such as Piaget who claim that factual evidence is relevant for assessing the 
adequacy of genetic epistemic accounts. That is to say, although prior 
epistemic theorising may be necessary for an adequate epistemic account,
it does not follow that epistemic theorising is a sufficient condition for
epistemic adequacy. For, in addition to such “prior epistemic theorizing”,
one also needs to look to the factual realm of actual epistemic development 
in order to determine the adequacy of one’s prior epistemic theorising. 
Indeed, if contemporary philosophy of science is at all on the right path, 
then just as actual scientific practice is relevant for assessing the adequacy 
of a philosophical account of science, so actual cognitive development is 
relevant for assessing the adequacy of a philosophical account of individual 
knowledge. In both fields there must be not only “prior epistemological 
theorizing” but also what we can call “posterior epistemological testing”, 
a testing of one’s epistemological theories by reference to the factual histori- 
cal domain, whether it be the history of science or individual cognitive 
development. 

An epistemological hypothesis may be initially generated by “prior epistemological
theorizing” free of factual questions; it does not appear to be
necessary to immerse oneself in factual details before coming up with an 
epistemic hypothesis. But once it has been generated, it must be evaluated 
(justified) by testing it (in some way) against actual historical fact. Other- 
wise, it remains purely a priori and has not yet proved its epistemic adequacy 
by reference to what is actually the case (epistemically speaking). If one 
wants to do philosophy of science instead of philosophy of science fiction 
this must occur, but likewise if one wants to construct a theory of the actual 
development of knowledge instead of a theory of a fictitious or ersatz 
development, a similar course must be followed. 

Assuming current philosophy of science is conceptually sound and that 
it is correct in claiming the history of science is evidentially relevant to the 
philosophy of science, one can view the above argument concerning genetic 
epistemology as thus mistaken. Of course, philosophy of science owes us 
some account of how the history of science is relevant to the philosophy of 
science and there are several such accounts available (e.g., Lakatos [1980], 
Laudan [1977]). Most of them agree, however, that philosophy of science 
is neither merely descriptive of how science is, nor purely prescriptive of 
how it ought to be. Rather, philosophy of science is conceived to be offering 
something like an ‘explication’ or ‘rational reconstruction’ of key philo- 
sophical concepts, activities and principles actually operating in science. 
The basic assumption here is that many scientific decisions have been 
relatively good ones, that there has been progress in science, and that 
scientists do, in general, reason adequately. The task of the philosopher of 
science is to give a logically perspicuous account of the implicit norms
and “patterns of reasoning” embodied in the history of science. Such an
explication must correspond to actual science at its paradigmatic best and
if there were a striking dissimilarity between actual paradigmatic science 
and our logical model, our logical model would be suspect. Hence history 
of science places strong constraints on what an adequate philosophical 
explication can be, although it does not rigidly determine a unique solution. 




Similarly, one could make an analogous case for genetic epistemology in 
relation to genetic psychology. Genetic epistemology is concerned with 
advancing a theory and explanation of the growth of knowledge in the 
individual. Given that there has been epistemic progress, for example, how 
can one explain this progress? Such an account would be a rational 
reconstruction of the growth of knowledge in the individual, just as philo- 
sophy of science provides a rational reconstruction of the growth of knowl- 
edge in the collective realm. Our genetic epistemological model would be 
“tested” against evidence obtained from genetic psychology just as the 
history of science provides evidence for evaluating philosophy of science. 
Thus, in an important sense, genetic epistemology can be said to make 
claims about genetic psychology, claims that require checking. 

Clearly, both philosophy of science and genetic epistemology are making 
normative claims, claims about epistemic adequacy, justification, progress, 
etc. Both are also attempting to improve one’s epistemic condition and to 
facilitate even further epistemic growth. This is as it should be since (as I 
have suggested) any genetic epistemology would be normative and empirical 
(just as the philosophy of science is). The problem of course is to see how 
such an epistemology is conceptually possible, a problem that may severely 
tax our conceptual resources. Although considerably more discussion of 
this point is certainly required, I have suggested a beginning to such an 
answer may be found by taking contemporary philosophy of science as a 
model for genetic epistemology. That is only one way to show genetic 
epistemology is possible, of course, but it is a philosophically interesting 
way nonetheless. 


Colorado State University 


REFERENCES 


BURIAN, R. M. [1977]: ‘More than a Marriage of Convenience: On the Inextricability of 
History and Philosophy of Science’, Philosophy of Science, 44, pp. 1-42. 

CARNAP, R. [1962]: Logical Foundations of Probability, 2nd ed. University of Chicago Press. 

CHALMERS, A. [1979]: ‘Towards an Objectivist Account of Theory Change’, British Journal
for the Philosophy of Science, 30, pp. 227-34. 

CHOMSKY, N. [1972]: Knowledge and Mind. Harcourt and Brace. 

FEIGL, H. [1974]: ‘Empiricism at Bay?’, in R. Cohen and M. Wartofsky (eds.), Methodological 
and Historical Essays in Natural and Social Sciences, pp. 1-20. D. Reidel. 

GARDENFORS, P. [1984]: ‘The Dynamics of Belief as a Basis for Logic’, British Journal for the
Philosophy of Science, 35, pp. 1-10.

HAMLYN, D. [1971]: ‘Epistemology and Conceptual Development’, in Mischel, T. (ed.), 
Cognitive Development and Epistemology, pp. 3-24. Academic. 

HAMLYN, D. [forthcoming]: ‘The Possibility of a Genetic Epistemology’.

HANSON, N. R. [1962]: ‘The Irrelevance of History of Science to Philosophy of Science’,
Journal of Philosophy, 59, pp. 574-86. 

HANSON, N. R. [1967]: ‘The Genetic Fallacy Revisited’, American Philosophical Quarterly, 4,
pp. 101-13. 

HARMAN, G. [1986]: Change in View: Principles of Reasoning. M.I.T. Press.

HARPER, W. L. [1977]: ‘Rational Conceptual Change’, in F. Suppe and P. D. Asquith (eds.),
PSA 1976, volume II, pp. 462-94. Philosophy of Science Association. 




HEMPEL, C. [1979]: ‘Formulation and Formalization of Scientific Theories’, in F. Suppe (ed.), 
The Structure of Scientific Theories, 2nd ed., pp. 244-54. University of Illinois Press. 

KITCHENER, R. F. [1980]: ‘Genetic Epistemology, Normative Epistemology and Psychologism’,
Synthese, 45, pp. 257-80.

KITCHENER, R. F. [1981]: ‘The Nature and Scope of Genetic Epistemology’, Philosophy of
Science, 48, pp. 400-15. 

KITCHENER, R. F. [1983a]: ‘Developmental Explanations’, Review of Metaphysics, 36, pp. 
791-818. 

KITCHENER, R. F. [1983b]: ‘Genetic Epistemology, History of Science and Genetic Psychology’,
Synthese, 65, pp. 3-32.

KITCHENER, R. F. [1986]: Piaget’s Theory of Knowledge: Genetic Epistemology and Scientific
Reason. Yale University Press. 

KITCHENER, R. F. [forthcoming]: ‘Is Psychology Relevant to Epistemology?’ 

KORNBLITH, H. (ed.) [1985]: Naturalizing Epistemology. M.I.T. Press.

LAKATOS, I. [1980]: The Methodology of Scientific Research Programmes, Philosophical Papers, 
volume I (John Worrall and Gregory Currie (eds.)), Cambridge University Press. 

LAUDAN, L. [1977]: Progress and Its Problems. University of California Press. 

LAUDAN, L. [1979]: ‘Historical Methodologies: an Overview and Manifesto’, in P. Asquith 
and K. Kyburg, Jr (eds.), Current Research in Philosophy of Science, pp. 40-54. Philosophy 
of Science Association. 

LEVI, I. [1980]: The Enterprise of Knowledge. M.I.T. Press.

McMULLIN, E. [1976]: ‘The Fertility of Theory and the Unit for Appraisal in Science’, in
R. S. Cohen (ed.), Essays in Memory of Imre Lakatos, pp. 395-432. D. Reidel. 

McMULLIN, E. [1970]: ‘The History and Philosophy of Science: A Taxonomy’, in R. Stuewer
(ed.), Historical and Philosophical Perspectives on Science, pp. 12-67. University of Minnesota
Press.

McMULLIN, E. [1979]: ‘The Ambiguity of “Historicism”’, in P. D. Asquith and K. E.
Kyburg, Jr (eds.), Current Research in Philosophy of Science, pp. 55-83. Philosophy of
Science Association. 

MUSGRAVE, A. [1974]: ‘Logical versus Historical Theories of Confirmation’, British Journal 
for the Philosophy of Science, 25, pp. 1-25.

NICKLES, T. (ed.) [1980a]: Scientific Discovery, Logic and Rationality. D. Reidel. 

NICKLES, T. (ed.) [1980b]: Scientific Discovery: Case Studies. D. Reidel.

PIAGET, J. [1950a]: Introduction à l’épistémologie génétique. Vol. I: La pensée mathématique.
Presses Universitaires de France. 

PIAGET, J. [1950b]: Introduction à l’épistémologie génétique. Vol. II: La pensée physique. Presses 
Universitaires de France. 

PIAGET, J. [1950c]: Introduction à l’épistémologie génétique. Vol. III: La pensée biologique, la
pensée psychologique, la pensée sociologique. Presses Universitaires de France. 

PIAGET, J. [1957]: ‘Programme et méthodes de l’épistémologie génétique’, in E. W. Beth et 
al. (eds.), Etudes d’épistémologie génétique, Vol. I: Epistémologie génétique et recherche 
psychologique, pp. 13-84. Presses Universitaires de France.

PIAGET, J. [1967a]: ‘L’épistémologie et ses variétés’, in J. Piaget et al. (eds.), Logique et 
connaissance scientifique, pp. 3-61. Gallimard. 

PIAGET, J. [1967b]: ‘Les méthodes de l’épistémologie’, in J. Piaget et al. (eds.), Logique et 
connaissance scientifique, pp. 62-132. 

PIAGET, J. and GARCIA, R. [1983]: Psychogenèse et histoire des sciences. Flammarion.

POPPER, K. [1979]: Objective Knowledge. Oxford University Press. 

REICHENBACH, H. [1938]: Experience and Prediction. University of Chicago Press. 

RESCHER, N. [1982]: Empirical Inquiry. Rowman and Littlefield. 

SIEGEL, H. [1978]: ‘Piaget’s Conception of Epistemology’, Educational Theory, 28, pp. 16-22.

SIEGEL, H. [1980]: ‘Justification, Discovery and the Naturalizing of Epistemology’, Philosophy
of Science, 47, pp. 297-321. 

SUPPE, F. [1979]: ‘The Search for Philosophical Understanding of Scientific Theories’, in F.
Suppe (ed.), The Structure of Scientific Theories, 2nd ed., pp. 3-241. University of Illinois
Press.

TOULMIN, S. [1972]: Human Understanding. Princeton University Press. 

WARTOFSKY, M. [1979]: Models: Representation and Scientific Understanding. D. Reidel. 


Brit. J. Phil. Sci. 38 (1987), 301-317 Printed in Great Britain


Credibility, Confirmation and 
Explanation 
by WILLIAM SEAGER 


Roughly speaking, the credibility of a theory is nothing but how believable 
it is: a theory is the more credible the more believable it is. A theory has 
high confirmation if it is well supported by the available evidence, in some 
sense of ‘well supported’ to be worked out by confirmation theory. It 
is natural to think that confirmation and credibility go hand in hand 
together—the more well confirmed a theory the more credible and vice 
versa. However natural this line of thought may be, recent work in the philo- 
sophy of science suggests that it may not be correct. The most important 
example of the splitting off of credibility from confirmation is perhaps that 
of Clark Glymour’s ‘bootstrap’ theory of hypothesis testing. 

In bootstrap testing (so called because an hypothesis from a theory is, or 
can be, tested with the aid of further hypotheses from within that very 
theory so that the theory ‘pulls itself up by its own bootstraps’) it is possible 
for a theory to be better tested than a proper part of the theory. In particular, 
Glymour shows that the observational consequences of a theory, T, may 
be less well tested or less testable, than T itself by a given body of evidence 
(see Glymour [1980], pp. 161-7 and also van Fraassen [1983a]). This is 
possible because of the non-holistic nature of bootstrap testing which per- 
mits single hypotheses to be tested and confirmed without thereby con- 
firming the entire theory. Thus there may well be a statement of or in T 
which entails all of T and which is tested by evidence E. But there may be 
no observational statement which entails all observational statements of T, 
and hence these are not tested by E (although, of course, some part of 
them will be). We cannot in this case simply take the conjunction of all 
observational statements to be this ‘default-hypothesis’ because of an 
explicit condition in the definition of bootstrapping which demands that 
each conjunct of such be tested by E, and we wouldn’t have this in the 
envisaged case. 

Of course, the idea that T can be better confirmed than a proper part of 
T wreaks havoc with the notion that confirmation is some sort of proba- 
bility-like measure. As Glymour says about this feature of his theory: ‘if 
this sounds odd to those used to thinking of confirmation and warrant in 
probabilistic terms, I think the conclusion ought to be that perhaps those 
are not the best terms in which to think about confirmation’ (Glymour 
[1980], p. 163). But such a position immediately raises the question of what
the value of confirmation is supposed to be, if it is not a matter of increasing
the probability, and hence the credibility of our theories. 

There are two sides to this oddness in Glymour’s theory. The one is that 
evidence which confirms an hypothesis, H, does not, ipso facto, confirm H
& I, for arbitrary I (for particular I it may well, of course). The other is
that, in blunt terms, a more informative statement can be better confirmed 
than a less informative one, even in the case where the latter is entailed by the 
former (I will generally use the term ‘informative’ so that a less informative 
statement is entailed by a more informative one, although this does not 
accord with general usage). 

Two obvious possible conditions on the confirmation relation are akin to 
these oddities. The first is that if evidence E confirms H then E also confirms
H & I (i.e., something which entails what E directly confirms). The second
is that if E confirms H and H entails I then E confirms I as well. Now, it
has long been noticed that these two conditions, if they were accepted
within a confirmation theory would yield the disastrous conclusion that
anything confirms anything. For, supposing the conditions, let E confirm
H, then E confirms H & I; but H & I entails I, hence E confirms I, even
though I can be any statement whatsoever.

But this bizarre result most certainly does not arise from a probabilistic 
view of confirmation. The primary intuition behind probabilistic con- 
firmation theory is summed up in the definitional schema: 


E confirms H =df P(H|E) > P(H).


(This bald presentation suffers from many defects but it does suggest at
least the flavour of the probabilistic approach. For a succinct examination 
of the difficulties that face this approach as well as a general defence of it 
see Horwich [1982].) I take some version of this definition to be the core 
of probabilistic confirmation theory. One can see immediately that the 
second of the two problematic conditions considered above is not validated 
by the probabilistic approach. For suppose that P(H & I|E) > P(H & I). It
does not follow that P(I|E) > P(I); this shows that, in general, probabilistic
confirmation does not follow upon entailment. Nor does the first mentioned
condition hold in the probabilistic approach. It does not follow that if
P(H|E) > P(H) then P(H & I|E) > P(H & I), for while evidently E tells in
favour of H, it may tell against I. Consider the sentences ‘Reagan sends
combat troops to Central America’ and ‘Kennedy runs in the 1988 Presidential
campaign’. Belief in the further sentence ‘Gallup polls show overwhelming
support for Reagan’s policies’ may well increase the probability
of the first sample sentence, but it would likely decrease that of the second
and could actually decrease the likelihood of the conjunction. This is so
even if the evidence makes H certain, for then P(H & I|E) just equals P(I|E)
which may be much lower than P(I|H).¹ Needless to say, in many cases
conditional probabilities will behave in accordance with these two conditions,
but the point here is that they will not in general so behave.

1 For example, let H = ‘Coin 1 lands heads’ and I = ‘Coin 2 lands tails’. Then, with normal
background knowledge, P(H & I) = 0.25. Now, let E = ‘Both Coin 1 and Coin 2 have two
head sides’. Then P(H|E) = 1, but P(I|E) = 0, so that P(H & I|E) = 0.
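The failure of both conditions under the probabilistic definition can be checked on a small finite probability space. What follows is only a minimal sketch, assuming a single roll of a fair six-sided die as a hypothetical toy model. The particular events, the evidence E, and the helper names `prob` and `confirms` are illustrative assumptions and are not drawn from the text.

```python
from fractions import Fraction

# Hypothetical toy model (an assumption for illustration): one roll of a fair die.
OUTCOMES = frozenset(range(1, 7))


def prob(event, given=OUTCOMES):
    """P(event | given) under the uniform measure on the die."""
    return Fraction(len(event & given), len(given))


def confirms(e, h):
    """Probabilistic schema: E confirms H iff P(H|E) > P(H)."""
    return prob(h, e) > prob(h)


E = frozenset({1, 2, 3})  # evidence: the roll is at most 3

# Entailment condition: H entails I (H is a subset of I) and E confirms H,
# yet E lowers the probability of I.
H = frozenset({2})
I = frozenset({2, 4, 6})
entailment_fails = H <= I and confirms(E, H) and not confirms(E, I)

# Conjunction condition: E makes H2 certain, yet the conjunction H2 & I2
# drops to probability zero given E.
H2 = frozenset({1, 2, 3, 4})
I2 = frozenset({4, 5, 6})
conjunction_fails = confirms(E, H2) and not confirms(E, H2 & I2)
```

The second case parallels the two-headed coin example: the evidence makes one conjunct certain while ruling out the other.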

Thus, although the probabilistic approach agrees with Glymour’s theory 
in denying these two possible conditions on confirmation it disagrees with 
this theory in most emphatically denying that a more informative state- 
ment could have higher confirmation (read: higher probability) than a less 
informative statement. This follows from elementary probability theory 
and thus is basic to the probabilistic approach. Hence Glymour’s suggestion 
that we should not look at confirmation probabilistically. However, splitting 
confirmation off from probability appears to have the distressing conse- 
quence that confirmation no longer runs in tandem with credibility. For 
surely the probability of a statement measures the likelihood of that state- 
ment’s being true. And surely it is only if there is an enhancement in this 
likelihood that the statement becomes more worthy of belief, that is, more 
credible. Whereas, in Glymour’s theory we have the less probable of two 
statements being better confirmed. 
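The claim that a more informative statement can never be more probable than a statement it entails follows from elementary probability theory, and it can be illustrated by brute force on a finite space. The sketch below assumes, purely for illustration, the uniform measure on a fair die. It searches every pair of events and every non-empty body of evidence for a violation. The names `all_events` and `violations` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model (an assumption for illustration): one roll of a fair die.
OUTCOMES = tuple(range(1, 7))


def all_events(outcomes):
    """Every subset of the outcome space, as frozensets."""
    return [frozenset(c)
            for r in range(len(outcomes) + 1)
            for c in combinations(outcomes, r)]


def prob(event, given):
    """P(event | given) under the uniform measure; `given` must be non-empty."""
    return Fraction(len(event & given), len(given))


EVENTS = all_events(OUTCOMES)
EVIDENCE = [e for e in EVENTS if e]  # conditioning needs a non-empty event


def violations():
    """Triples (a, b, e) where a entails b but P(a|e) > P(b|e)."""
    return [(a, b, e)
            for e in EVIDENCE
            for a in EVENTS
            for b in EVENTS
            if a <= b and prob(a, e) > prob(b, e)]
```

On this model the search comes back empty, as the general theorem requires; the check illustrates the theorem for one space and is not a proof of it.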

In Glymour’s bootstrap account a whole theory (or a substantial part of 
a theory) can be tested only if there is some sentence which entails all the
claims of the theory (or part). That such a sentence exists is not too uncom- 
mon for entire theories, but is much more so for certain sub-parts of 
theories. Thus a ‘disconnected’ set of statements, such as the observational
consequences of a theory are likely to form, will not be tested by evidence
which bears on only a subset of these statements. There may be ways of 
mirroring this feature of Glymour’s theory in an impure but basically 
probabilistic setting. For example, one might define confirmation so: E 
confirms H iff for all conjuncts, H* of H, P(H*|E) > P(A). On such a 
theory, a statement, T, which ‘embodied’ all of theory T*, could entail a 
conjunction s; & s & ... & s, (thought of as, say, the observational 
consequences of T*) where T could be confirmed by E even though the 
less informative conjunction would not be since one of the s;s didn’t meet 
the confirmation condition. Of course, this conjunction would remain just 
as probable as (if not more probable than) T. Following this line, one just 
cannot avoid separating confirmation and credibility. 
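
The conjunct-wise definition suggested in this paragraph can be tried out on a toy model. The sketch below is illustrative only: the ten equally likely ‘worlds’, the particular sets standing for T, s1, s2 and E, and the function names are all assumptions made here, not anything drawn from Glymour or from the text.

```python
from fractions import Fraction

# Hypothetical toy model: ten equally likely "worlds", numbered 1..10.
WORLDS = frozenset(range(1, 11))

def prob(event, given=WORLDS):
    """P(event | given) under a uniform distribution on WORLDS."""
    given = frozenset(given)
    return Fraction(len(frozenset(event) & given), len(given))

def confirms(evidence, conjuncts):
    """E confirms H iff P(H*|E) > P(H*) for every conjunct H* of H."""
    return all(prob(h, evidence) > prob(h) for h in conjuncts)

# Statements are modelled as the sets of worlds in which they are true.
T = {1}                      # a theory treated as a single conjunct
S1 = {1, 2, 3, 10}           # first observational consequence
S2 = {1, 4, 5, 6, 7, 10}     # second observational consequence
E = {1, 2, 8, 9}             # the evidence

assert T <= (S1 & S2)        # T entails the conjunction s1 & s2

t_confirmed = confirms(E, [T])                  # P(T|E) = 1/4 > 1/10
conjunction_confirmed = confirms(E, [S1, S2])   # fails: P(s2|E) = 1/4 < 6/10
```

In this toy model T counts as confirmed by E while the weaker conjunction it entails does not, although the conjunction (probability 2/10) remains more probable than T (probability 1/10).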

This feature of Glymour’s theory has been noticed by philosophers, of
course. Horwich, in his [1982], takes it to be a defect (cf. pp. 139–42).¹ Van
Fraassen (with suspect motives perhaps) praises this aspect of the theory, 


1 In particular, Horwich takes exception to Glymour’s denial that P confirms P & Q, on the
grounds that normally learning that P would enhance one’s confidence in P & Q. But we 
have seen that this is not a general feature of the probabilistic approach, and it is clear that 
this increase in confidence is based on the assumption that there is no knowledge of Q’s 
relation to P. Once we include this possibility, one cannot enunciate a general rule relating 
one’s confidence in the conjunction to the learning of one conjunct. But notice that in a way 
Glymour is demanding that confirmation follow only upon positive relevance. In cases where 
the evidence (for P or P itself) is irrelevant to Q there would be no confirmation of 
the conjunction. On the probabilistic approach such evidence would increase the probability 
of the conjunction, and thus confirm it. 


saying: ‘the correct reading is not that testing can provide us with more 
reason to believe an audacious theory than it provides for belief in the 
empirical adequacy thereof, but, instead that it can give us other sorts of 
reasons for acceptance’ ([1983a], p. 21, my emphasis). For van Fraassen,
acceptance means something like a commitment to talk in the language of 
the accepted theory, but certainly not to believe that it is true—instead one 
believes that it is merely empirically adequate. 

Thus we are faced with a dilemma. On the one hand, there is a natural 
desire toward, as Horwich puts it ‘the view we have been compelled to 
abandon—the correlation between confirmation and enhanced credibility’ 
([1982], p. 140). On the other hand, there is an equally natural desire to
solve the ‘theoretician’s dilemma’ and find a rationale for preferring a theory 
to mere lists of empirical correlations. Glymour’s account fills our second 
hand, and his theory is undeniably powerful and plausible. The cost is, 
however, high for scientific realists (meaning by this term those who believe 
that science aims at the truth, therefore at belief-worthiness). If scientific 
testing favours theories which are less likely to be true, how can it be that 
we ought to believe in tested theories? 

If this problem were restricted to Glymour’s theory of confirmation then 
it might stand as a fault of just that theory but, of course, this problem is 
not so restricted. Developing a theory to account for, to explain a range of 
observed fact is generally supposed to enhance one’s epistemic position. 
One is in a better position vis-à-vis understanding the world when one has 
such a theory. If ‘having a theory’ amounts to believing a theory then 
the above considerations seriously undermine this view. How can one’s 
epistemic position be improved by having the probabilities of one’s beliefs 
decreased? Many aspects of one’s position in the world may well be 
increased by accepting a theory in van Fraassen’s sense—one can explain 
more (though the case of explanation is complex and will be considered 
more closely below), one can make more, and more accurate, predictions 
and one has more control over nature (if one’s theory is of the sort that 
yields technology). But it is difficult to see how one’s position is further 
improved by in addition actually believing the theory in question. 

It may be worth noting here that the same argument cannot be run 
against van Fraassen’s position—that in accepting a theory, one believes it 
only to be empirically adequate. It is true that the assertion that a theory 
is empirically adequate goes beyond the evidence and hence is less likely 
than this evidence. But one is in no position to use a theory, either for 
explanation, prediction or control, unless one believes that it is empirically 
adequate, but one need believe no more than this in order to use the theory. 
Why stick your neck out further than you need to? 

We are invited, then, to give up the struggle to unite the estranged 
couple and accept the divorce of confirmation from credibility. We should 
articulate complex theories involving any sort of hypotheses regarding 
hidden aspects of nature in order to enhance confirmation in the Glymourian
sense of the term, while restricting credibility to those aspects of the
theory which can actually be experienced. In [1983b] van Fraassen divides 
the virtues of theories into two distinct sorts, which he calls ‘confirmational’ 
and ‘informational’. The former has to do with truth and likelihood, and 
hence corresponds to my sense of credibility; the latter with theoretical 
power, the ability of a theory to say more about the world, and hence 
corresponds to my use of ‘confirmation’. This shift in usage is unfortunate— 
to avoid confusion some regimentation is in order. Let us follow van Fraas- 
sen’s table: 


THEORETICAL VIRTUES

(a) Confirmational: such virtues raise the probability of the theory at issue
(b) Informational: such virtues increase the power and scope of a theory


Now when a theory is confirmed it is more likely to be true (we could 
reserve the word ‘tested’ for the sense of ‘confirmation’ used above and 
associated with Glymour’s theory). Note we can increase a theory’s infor- 
mational adequacy simply by adding a postulate to it. 

What I want to do now is pursue the consequences of splitting the virtues 
of theories in this way. Hopefully we shall find that, in the end, they must 
be brought back together. The guide in this pursuit will be the notion of 
explanation. 

It is undeniably a virtue of a theory that it can explain a wide range of 
phenomena, but what sort of virtue is this? van Fraassen maintains that it 
is just an informational virtue. For him, an explanation is basically an 
answer to a ‘why-question’ and a theory which can answer more such 
questions must have more information in it, hence must be less likely to be 
true than, say, a proper part of that theory which may not be able to answer 
a similarly wide spectrum of ‘why-questions’. To the extent that scientists 
seek out explanatorily more powerful theories within a given evidential 
context they are then seeking less probable theories. Almost in virtue of 
our own success the epistemic value of our work decreases. The disturbing 
thing about this seemingly ludicrous claim is that, from a certain point of 
view, it must be correct (and, as the above quote suggests, Glymour saw 
this difficulty). 

Let us examine van Fraassen’s theory of explanation more closely. Basi- 
cally, explanations are just answers to questions of the form ‘why P’, where 
the appropriateness of the answer is in part a matter of the context within 
which the question is raised. So, for example, if one asks, ‘why is it best to 
swim when the tide is coming in?’, various answers are possible depending 
upon the context in which the question is asked. If the context indicates 
that safety is the primary issue, then the answer could be ‘the currents are 
safer when the tide is rising since they tend to carry you toward the shore’. 


If, instead, the primary topic is comfort, the answer would be ‘the water is 
warmed as it comes in over hot sand and so swimming is more comfortable 
while the tide is rising’. Both answers appeal to theories, more or less 
directly, along with assumed background knowledge. Both answers are of 
course correct, within their own contexts. 

We have an explanation of P when we accept a theory which, in context 
c, can be used to answer ‘why P?’ This acceptance need only be, according
to van Fraassen, the weak belief that the theory employed in the answer is
empirically adequate.¹

Let us consider a simple ‘why-question’ with an explicitly theoretical 
answer: ‘why does the time of high tide advance by about one hour every 
day?’ The answer might be something like this: ‘the tides are primarily
caused by the gravitational influence of the moon and hence depend upon 
the relative position of the moon with respect to the rotating earth. If the 
moon itself did not revolve around the earth the time of high tide would be 
the same every day. But the moon does revolve around the earth, with a 
period of 28 days. Thus in one day the moon’s position relative to the earth 
alters by about 1/28th. Thus the same position is reached about 1/28th of 
a day later each day, which is to say, about an hour later’. The primary 
theoretical claim in this answer is that concerning the cause of the tides 
(which explanation might be opposed to one which invoked only the purely 
observable correlation between the position of the moon and the state of 
the tide). 
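
The arithmetic behind ‘about an hour’ can be checked directly, taking the 28-day period given in the answer at face value:

```latex
\[
\frac{1}{28}\ \text{day} \;=\; \frac{24\ \text{h}}{28} \;\approx\; 0.86\ \text{h} \;\approx\; 51\ \text{min}.
\]
```

On these figures the daily retardation comes out at a little under an hour, which is consistent with the rough value the answer cites.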

The obvious objection to van Fraassen’s view which such an answer 
suggests is this: how can anyone give the above answer as the explanation 
of tide retardation if he does not believe that the tides are caused by the 
gravitational influence of the moon, since this postulation of influence is of 
an unobservable addition to the correlation between the moon’s position 
and the state of the tide? van Fraassen considers this objection, or one very 
like it, in his [1980], pp. 151 ff. His answer is this: 
we must distinguish carefully what a theory says from what we believe when we 
accept that theory . . . The epistemic commitment involved in accepting a scientific 
theory, I have argued, is not belief that it is true but only the weaker belief that it 
is empirically adequate. ... Now, when I ask the question . . . I imply that I believe 
that this question arises. But that means only that my epistemic commitment
indicated by, or involved in asking this question is exactly—no more and no less— 
the epistemic commitment involved in my acceptance of these theories. 


Here van Fraassen is actually considering the related problem of whether 
one must believe the presupposition² of the relevant ‘why-question’ in order
to sensibly ask it, but his answer also ought to apply to our problem. Note 


1 van Fraassen’s full theory of ‘why-questions’ is much more detailed and powerful than this
sketch suggests, but the above is sufficient for our purposes.

2 This is not van Fraassen’s sense of the term, but is a convenient way to refer to what the 
why question assumes to be true in order for it to ‘get off the ground’ (thus, ‘Q’ is the 
presupposition of ‘why-Q’). van Fraassen uses the term ‘topic’ for this purpose. 


that our example ‘why-question’ is not in itself theoretical—it mentions 
only observables. The problem here, which is extremely common, is that 
in giving an answer one must appeal to certain theoretical ‘facts’. 

In order to sharpen this difficulty, consider an adaptation of a technique, 
which van Fraassen uses in his [1980], for producing empirically equivalent 
theories. Let T be some theory. Then let T’ be the theory which asserts 
only that T is empirically adequate. We can say then that T = T’+M+E, 
where M is the theoretical machinery of T and E is a set of statements 
asserting the existence of the unobservables postulated by T. Further, 
consider the theory T* = T’+M+¬E, where ¬E is the negation of each
claim in E. One whose epistemic position is belief in T we can call a scientific 
realist (with respect to T). One who believes only T’ we may call a weak 
anti-realist (with respect to T), and one who believes T* is a strong anti- 
realist (with respect to T). 

Now, probability theory tells us that T’ must be at least as likely as T
and, in fact, if E is not certain then T’ will be more likely. Further, if, as
is plausible, the existence of unobservables with the exact properties specified
by E (with the help of M) is not very likely then ¬E is more likely
than E and hence T* is also more probable than T. So goes the confirmation
side of the story.
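
The two comparisons can be spelled out as follows. This is a sketch under stated assumptions: the ‘+’ of the decomposition is read as conjunction, $\neg E$ stands for the negated existence claims, and the relevant likelihoods of E and its negation are read as conditional on T’ and M. None of this is made explicit in the text.

```latex
\[
P(T) \;=\; P(T' \wedge M \wedge E) \;=\; P(T')\,P(M \wedge E \mid T') \;\le\; P(T'),
\]
\[
\frac{P(T^{*})}{P(T)}
\;=\;
\frac{P(T' \wedge M)\,P(\neg E \mid T' \wedge M)}{P(T' \wedge M)\,P(E \mid T' \wedge M)}
\;=\;
\frac{P(\neg E \mid T' \wedge M)}{P(E \mid T' \wedge M)} .
\]
```

The first inequality is strict whenever $P(T') > 0$ and $P(M \wedge E \mid T') < 1$. On the conditional reading, the second line shows that T* is more probable than T exactly when the negated existence claims are more likely than E, given T’ and M.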

The informational side of the story may go somewhat differently. 
Although van Fraassen eschews the notion of explanatory power (‘there 
can be no question at all of explanatory power as such’ [1980], p. 156) he 
need not. We can say that a theory T has more explanatory power than T’
just in case for every ‘why-question’, Q, and context, c, if T’ can answer Q
in c then T can answer Q in c; and there is a Q and c where T can answer
Q and T’ cannot. This is a weak notion of explanatory power: it does not,
for example, consider the quality of the answers the rival theories provide. 
But it will obviously suffice to compare the three theories outlined above. 
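
The weak notion of explanatory power just defined amounts to a proper-superset test on what each theory can answer. The following is a minimal sketch under that reading: each theory is represented by the set of (question, context) pairs it can answer, and the function name and example data are hypothetical, introduced only for illustration.

```python
def has_more_explanatory_power(answers_t, answers_u):
    """T has more explanatory power than U iff T answers every
    (question, context) pair that U answers, and at least one pair
    that U does not. On sets this is the proper-subset test U < T."""
    return answers_u < answers_t

# Hypothetical example data, loosely modelled on the light-switch case.
LIGHT = "why does the light go on when the switch is turned?"
full_theory = {(LIGHT, "regularity wanted"), (LIGHT, "mechanism wanted")}
adequacy_claim_only = {(LIGHT, "regularity wanted")}

full_beats_weak = has_more_explanatory_power(full_theory, adequacy_claim_only)
weak_beats_full = has_more_explanatory_power(adequacy_claim_only, full_theory)
```

As the text notes, the comparison ignores the quality of the answers; it only records which questions can be answered in which contexts.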

It seems clear that T has more explanatory power in this sense than either 
T’ or T*. For any ‘why-question’ which does not merely request some
empirical regularity as its answer will be unanswerable on T’ (or T*) but
may well be answerable on T. For example, the question ‘why does the 
light go on when the switch is turned?’ may, in a certain context, be asking 
for the story of what goes on in the wire after the switch is turned. The 
answer we would like to give is one involving the ‘flow of electrons’ but 
this is apparently not available to the theory whose only assertion is that 
electrical theory is empirically adequate.¹

Thus it seems that both the weak and strong anti-realists, when faced 
with such a question, can at best say only ‘it just happens this way’. They 
pay in this way the penalty for their timidity. But they need not pay, 


1 The appearance of causation in the answers to the two example questions given is probably 
no accident. The appeal to causal structure is quite fundamental to scientific activity, as is
convincingly argued by Salmon in his [1984]. 


urges van Fraassen. The anti-realists have simply mistaken their epistemic 
attitude or commitment towards a theory for the attitude they ought to take 
when giving explanations. If one avoids making this mistake one may give 
the very same explanation as the realist without thus becoming a realist. As 
van Fraassen says: ‘immersion in the theoretical world-picture does not 
preclude “bracketing” its ontological implications’ ([1980], p. 81). Let us
consider this seemingly disingenuous reply with regard to the strong anti- 
realist first. 

The strong anti-realist positively believes that the various unobservables 
referred to by T do not exist. For simplicity, let us remain with the 
electron-flow example. We suppose that in reply to the question, why does 
the light go on when the switch is flipped, the strong anti-realist replies 
that the flow of electrons through the filament of the light bulb causes it to
heat and hence give off light. This is identical to the realist’s answer. But 
consider the further question: ‘is that the true explanation?’ Surely here 
the realist and the anti-realist must differ (or what is the difference between
them?). The anti-realist at this point must demur and admit that he was
only speaking from the standpoint, as it were, of a certain theory which he 
in fact believes to be false. All that he really believes is that the light will 
shine when the switch is flipped (and that this is not actually caused by 
electron flow). 

The case of the weak anti-realist is not much different. In reply to the 
follow-up question she too must admit that all this talk of electrons is only 
appropriate from ‘within’ a theory to which she cannot give full credence. 
Even if our realist is in fact somewhat skeptical of his own theory his reply 
here will differ; for he will say that it is his opinion that this is the explanation, 
or that he thinks that the world operates thus and so, whereas our anti- 
realists are more assertive here—they know that they have no idea of the 
true explanation (and I suppose they also believe that only those in a 
kind of metaphysical trance could imagine that one could get the true 
explanation¹).

One must ask here whether one can offer as an explanation a putative 
fact which one in truth believes to be false or, at least, does not believe to 
be true. van Fraassen says that one may, so long as this fact is based on or 
comes from an ‘acceptable’ theory. But this point is an empty triviality if the 
definition of ‘acceptable’ includes ‘may be used as a source of explanations’. 
There is no reason why those who use scientific theories must be naive 
regarding the epistemic stance they take towards those theories. van 
Fraassen says that ‘the epistemic commitment of the discussants is not to 
be read off from their language’ ([1980], p. 152) but while this is true it 
remains that their epistemic attitude can be ascertained by directly asking 
them. If they are van Fraassian (pardon the neologism) anti-realists they 


1 See van Fraassen’s remark: ‘Realist yearnings were born among the mistaken ideals of
traditional metaphysics’ [1980], p. 23. 


may well know this (if van Fraassen’s position is correct then this would 
be an unmitigated improvement in their epistemic position). Supposing 
they are aware of their (and the correct) epistemic attitude towards their 
theories can they offer theoretical explanations in good faith? 

It seems to me that they cannot. I, for example, am aware of the empirical 
usefulness of Newtonian celestial mechanics. But I am also aware that, 
ultimately, it is a theory not worthy of belief. It seems to follow that I cannot
use Newtonian theory to answer serious questions about, for example, the 
motions of the planets. I can use it for prediction and control (to time the 
return of Halley’s comet, to pilot a spaceship to Mars), I can use it as a 
pedagogical device in the teaching of physics, but I cannot use it to provide 
serious answers to serious questions about why the planets move as they 
do. In good faith, I must say there is a great deal more to the story than 
what Newtonianism provides. 

It must be admitted, however, that Newtonian theory is not only 
unworthy of belief tout court but it is also unworthy of belief in its mere 
empirical adequacy. So it might be replied that of course it is unavailable 
for serious answers—it is simply not an acceptable theory. The difficulty 
with this reply is that lack of belief seems to be enough in itself to vitiate 
explanation. An example which illustrates this thesis is the early acceptance 
of Copernican teaching by the church as a means of empirical prediction. 
Since there were independent grounds for believing that the theory was 
false even if empirically adequate it would have been inappropriate to 
explain, say, the retrograde motion of superior planets via the motion of the 
earth, though it was alright to calculate this motion using the feigned 
hypothesis that the earth moved. We may fictionalise this example some- 
what by imagining that Copernican astronomy met all conditions of 
acceptability: suppose it was truly empirically adequate, powerful, unifying, 
simple, etc. It fails only with respect of being true (as we know from 
scripture). In such a case it would not be available for explanation, at least 
when serious questions were being asked, contrary to what certainly appears 
to follow from van Fraassen’s position, namely that mere lack of truth, given 
all other virtues would be irrelevant to acceptance (since the scientists are
not seeking truth) and hence to usefulness in explanation.¹ But, in fact,
wasn’t the recognition of this feature of explanation the point behind the 
church’s strategy of ‘hypotheticalising’ Copernicus? Consider these two 
quotations, the first from the instructions for the censorship of Copernicus’ 
De Revolutionibus: 

If certain of Copernicus’ passages on the motion of the earth are not hypothetical,
make them hypothetical; then they will not be against the truth or the Holy Writ.
On the contrary, in a certain sense they will be in agreement with them, on account 


1 This point would appear to seldom if ever have any real significance, since truth can
hardly be ascertained independently of general scientific ‘goodness’. In the case of early 
Copernicanism we have, however, a case where, from the point of view of the staunchly 
religious at any rate, truth can be ascertained independently. 


of the false nature of suppositions, which the study of astronomy is accustomed to 
use as its special right (Gingerich [1982], p. 138). 


The second is from a letter of Cardinal Bellarmino to Father Foscarini: 


I say that if there were a true demonstration that the sun is in the centre of the 
universe and that the sun does not go around the earth but the earth goes around 
the sun, then it would be necessary to be careful in explaining the scriptures that 
seemed contrary. ... But I do not think there is any such demonstration. .. . To 
demonstrate that the appearances are saved by assuming the sun is at the centre 
and the earth in the heavens is not the same thing as to demonstrate that in fact the 
sun is in the centre and the earth in the heavens (Gingerich [1982], p. 137). 


These passages indicate to me the recognition of the church that they could 
in a way give the astronomers the Copernican system for their use in 
calculation, prediction, etc., but by denying truth to Copernicanism would 
make it unavailable when serious talk began about the structure of the 
universe. Such serious talk would naturally involve the correct explanation 
of various phenomena. Since it wasn’t true, Copernicanism would be 
unavailable for serious explanation—there scripture could safely reign. 

The point I am belaboring is just that one cannot offer as an explanation
a proposition one does not personally believe. A scientist aware (because 
of, we might suppose, having read van Fraassen) of the proper epistemic 
stance towards his theories must cease believing these theories. I do not see 
how he could then offer a theoretical ‘fact’ stemming from one of his theories 
as a serious answer to a seriously meant ‘why-question’. If one offers ‘P’ as 
an answer to ‘why-Q’ and one is serious about giving an explanation then 
there is only one answer available to the further, reassurance seeking ques- 
tion ‘is that the true explanation’ and that answer is ‘yes’ (with the caveat, 
noted above, that one may at this point get serious and admit some doubts
about one’s theory while maintaining its basic truthfulness). 

The idea of being so ‘immersed’ in some theory’s ‘conceptual scheme’ 
that one simply talks in its terms without giving any thought to the correct 
epistemic stance toward the theory gains in plausibility in the sort of 
example van Fraassen considers (see above quote, p.__). There, the ques- 
tion itself brings in the theory, some theoretical ‘fact’ lies within the pre- 
supposition of the ‘why-question’. But that is by no means the only sort of 
question which can claim a scientific answer. It is easy to frame serious 
questions in non-theoretical (or relatively non-theoretical) language whose 
answers inexorably lead one into some scientific theory, or, more likely, 
into many. ‘Why are some stars a different colour than others?’ ‘Why do 
the northern lights appear so much more frequently in the far north?’ The 
list is, of course, endless. 

At this point one may raise the objection that it is common to ascribe the 
power to provide explanations to old and abandoned theories. Thus we say 
that Newton’s theory explained (or even that it does explain) the tides and 
that phlogiston theory explained why candles in closed containers go out 


after a short time.¹ To see what is going on here one must distinguish the
form of an explanation from explanations themselves. The form of an 
explanation of P is an answer to the question, ‘why P’. The explanation of 
P is a (or possibly the) true (and contextually relevant) answer to this why- 
question. Old theories still yield linguistic items which have the forms of 
explanations. When we say that Newton’s theory explained the tides we 
mean that the theory gives an answer (and one with many virtues, thus 
distinguishing Newtonianism from phlogiston theory and Astrology in this 
regard) to the question ‘why are there tides’. 

The conclusion which ought to be drawn is that to give an explanation 
one must believe in that explanation. It does not follow that we ought to 
believe our theories. It may be that we ought to give up some of the 
explanations we are presently so fond of. Even the staunchest realist will 
hedge when providing an explanation which employs the most esoteric, 
newest and least well tested aspects of theories. Perhaps the correct course 
is to eschew explanation save where the statement of mere empirical cor- 
relation will suffice. In this case the proper attitude towards an accepted 
scientific theory will be just belief in its empirical adequacy. 

If we do not wish to make this retreat we must face the dilemma of 
confirmational versus informational virtues once again. Using the notation 
introduced above, any theory T can be split into parts so that 
T = T’+M+E, where, again, T’ is the assertion that T is empirically
adequate, M represents the theoretical machinery of T and E is the set of
assertions of the existence of various unobservable entities that T makes.
It is clear that any explanation which goes beyond the assertion of some
empirical correlation deducible from T (and hence also from T’) will make
reference to E in some way or other. It will thus employ information in
excess of that in T’, hence an increase in explanatory power would appear
to necessitate a corresponding decrease in probability. 

This is the crux of the dilemma: generally, an increase in explanatory 
power is thought to represent an improvement in a theory,² but it seems
we have just shown that this cannot be a confirmational improvement—not 
something which gives us more reason to believe the theory. But it also 
seems from the foregoing that without belief one cannot appeal to a theory 
for serious explanation (or, perhaps more realistically, without the expec- 
tation that at least some of our theories, maybe with some modifications, 


1 It is interesting that one is not very tempted to say that phlogiston theory explains the
candle phenomenon. Presumably, this results from the comparisons of the various virtues of 
Newtonianism versus phlogiston theory. 

2 One must be careful here to bear in mind that it is the distinction between claiming a theory
true and claiming it to be merely empirically adequate which is at issue. Generally as theories 
become more explanatorily powerful they also encompass a wider range of phenomena. The 
mere addition of hidden machinery to a theory is hardly an improvement. Everyone would 
agree that such reduces the overall probability of the theory (one reason why ‘simplicity’ is 
a virtue). But even in the case of ‘motivated’ increases in complexity the empirical adequacy 
claim must be more likely than the truth claim. 


will eventually be belief-worthy, we could not offer any theoretical expla- 
nations). Are we thus doomed to either relative explanatory silence or the 
steady erosion of our epistemic position? 

Perhaps we can escape without springing the trap. Consider for a moment 
the sort of probability which is assigned to theories. Without a wholesale 
subscription to a subjectivist interpretation there can nonetheless be little 
doubt that this probability is not a matter of frequencies or propensities. It 
is, rather, the degree of belief which a rational agent ought to maintain with 
regard to the theory in question. Considerations of this sort suggest that 
we employ as a model the notion of subjective probability. We can suppose 
that for each rational agent (or perhaps, in the case of science, each rational 
community), a, there is an associated ‘degree-of-belief’ function (which 
must, for familiar reasons, be a probability function), Pₐ. A theory, T, can
be represented by one (rather long, no doubt) statement which is assigned
some value by Pₐ. New evidence will shift this value according to some
conditionalisation scheme. Since Pₐ is a probability function, it is unquestionable
that adding information to T will reduce the value Pₐ assigns to it
no matter the evidence (save for the case where we add probabilifying 
evidence to T, but this would hardly count as a theoretical extension). 
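
For concreteness, the simplest such scheme and the point about added information can be written as follows. This is a sketch: strict conditionalisation is chosen here as the representative scheme, and the symbol A for an added postulate is introduced only for illustration.

```latex
\[
P_a^{\text{new}}(T) \;=\; P_a(T \mid E) \;=\; \frac{P_a(T \wedge E)}{P_a(E)}
\qquad (P_a(E) > 0),
\]
\[
P_a(T \wedge A \mid E) \;\le\; P_a(T \mid E)
\quad\text{for any added postulate } A \text{ and any evidence } E .
\]
```

Within a single probability function, then, extending T by A can at best leave its value unchanged, and lowers it whenever A is not already certain given T and E.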

However, the question which arises here is whether all learning of this 
sort can be modelled by some sort of conditionalisation within a single 
probability function. As an example to suggest the contrary, consider a 
lover given to jealousy, but who at the moment has almost complete trust 
in her lover. But suppose a pattern of events which are somewhat puzzling 
on the hypothesis of faithfulness but are by no means evidence of unfaith- 
fulness. However, if the jealous lover discovers a consistent story which 
explains these puzzling occurrences and which involves the unfaithfulness 
of her lover, she may suddenly take up the belief in her lover’s inconstancy. 
She will take up belief in her story (or at least increase its probability which 
was, by hypothesis, initially very low). Could it ever be rational for her to 
do so? I think so. Her new ‘theory’ has more explanatory power than her
old (which had none with respect to the relevant phenomena—there was 
no explanation of the puzzling behaviour) but is taken to be more probable. 

But this cannot be unless we postulate not only a change in probability 
value on evidence, but also the occasional alteration of the probability 
function itself. When the evidence suddenly falls into place, as it were, and 
we say ‘now I see!’ we at least seem to have a better understanding of the 
world. And this seems to be an epistemic matter as well, in the sense that 
this new understanding affects the belief-worthiness of our world picture. 
Further, it is the articulation of the new theory rather than the accumulation 
of new evidence that occasions this response. The sudden new way of 
looking at the world reorders one’s plausibility scale. 

This new plausibility scale is no more than a new subjective probability 
function, within which the values assigned to various statements may 
have radically changed from the values assigned by the previous function. 


The change may be radical in that it does not follow the rule of con- 
ditionalisation—the change may occur without the receipt of any new evi- 
dence, merely with the articulation of a theory. The change may also not 
follow the rule of so-called ‘Jeffrey’s’ conditionalisation; that is, the values 
of conditional probabilities may change as well as that of absolute proba- 
bilities—the new theory may well suggest that evidence bears in ways that 
differ from those codified by the original probability function. 
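
For reference, Jeffrey conditionalisation over a partition $\{B_i\}$ can be stated as below. The notation is supplied here and is not the author’s.

```latex
\[
P_{\text{new}}(A) \;=\; \sum_i P_{\text{old}}(A \mid B_i)\, P_{\text{new}}(B_i),
\qquad\text{which presupposes}\qquad
P_{\text{new}}(A \mid B_i) \;=\; P_{\text{old}}(A \mid B_i).
\]
```

Strict conditionalisation is the special case in which one cell of the partition receives new probability 1. The kind of change described in this paragraph breaks the right-hand condition: the conditional probabilities themselves shift, so the update falls outside both rules.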

This last sort of change is one that seems to occur very frequently in 
science. One simply does not see the relevance of, say, increased pollution 
to the colour of moths until one has articulated some sort of selection theory. 
A detailed working out of one example of this can be found in Harper 
[1983]. There it is shown how the unification of knowledge brought about 
by Newton’s theory of gravitation suddenly made such previously irrelevant 
facts as the motions of Jupiter’s satellites evidentially relevant to Galileo’s 
laws of projectile motion. 

Perhaps the change in plausibility ordering which occurs upon the arti- 
culation of a new theory is like cases of suddenly reordered perception. 
This is the more likely if, as the cognitive psychologists suggest, perception 
is in part a matter of unconscious hypothesis generation and testing. It 
happens often enough that a scene changes its appearance radically without 
the input of new evidence or data but simply because a new interpretation 
strikes us. This view is reminiscent of Hanson’s in his [1958]. The book is 
better remembered for the notions of observation and theory-ladenness, 
but perhaps it is more important to consider its examination of occasions 
of sudden changes in our way of looking at the world brought about by 
theory articulation. In fact, Hanson explicitly likens the apprehension of a 
new theory to re-interpretation of the world. Talking about Kepler he says: 
‘the difference between “librations vs. ellipse” and “librations = ellipse”
is like the difference between the bird and the antelope...’ ([1958], p. 83; the
last remark refers to one of those now philosophically familiar ambiguous 
figures). 

This sort of change in plausibility assignment is also something like what 
sometimes happens while reading literary criticism. A new interpretation 
of a character may suddenly strike us and be plausible simply because of 
the way everything ‘fits in’ on that interpretation. Needless to say, literary 
critics hope to convince us that their interpretation is correct not merely 
that it ‘saves the phenomena’ (for when the novel is laid out for you 
when you start, you start with the phenomena being saved or else your 
interpretation does not start at all). They aim not only to appeal to our prior 
notions of plausible interpretations via the appeal to new evidence (this
can happen of course) but also to alter our notion of the plausibility of 
interpretations. 

Although this sort of change is most obvious in cases of sweeping theor- 
etical revolution, a possible simple example is provided by the case of 
cryptograms. Suppose a complex signal was received by one of our radio 


telescopes (which even now have sometimes been expressly aimed at stars 
with the purpose of listening for signs of intelligence, and lore has it that 
spare moments between scheduled research are spent listening to possible
sources). The proposition that this signal came from an intelligent source, 
and thus that there are intelligent extra-terrestrials, would be given low 
probability (since any interpretation of the signal as natural would be 
preferred—the reasons for this are probably complex and interesting, having 
to do with the perceived importance of the hypothesis). But if someone 
could articulate a theory or code which made the signal into a sensible 
message (say, a complex mathematical sequence) then the probability that 
it was indeed a message would go up substantially (given that the theory 
was reasonable) even in the absence of further information. We can repre- 
sent this in the terms we have used in our discussion. We have an initial 
probability assignment, P(A|E), which is low, where A is the proposition 
that there are alien intelligent beings, and E is the proposition that the 
scientific community is receiving the evidence provided by the signal itself. 
The introduction of the further hypothesis regarding the coding scheme 
can be represented by C (let us say that C is the proposition that aliens 
are using coding scheme x). We then seem to have P(A & C|E) fairly high,
which would be impossible on the grounds of simple probability theory. 
Thus it must be that the articulation of the coding scheme hypothesis 
somehow alters the probability function itself. 
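
The impossibility claimed here is a direct consequence of the product rule, on the assumption that one and the same function P is used throughout:

```latex
P(A \,\&\, C \mid E) \;=\; P(A \mid E)\, P(C \mid A \,\&\, E) \;\le\; P(A \mid E),
\qquad \text{since } P(C \mid A \,\&\, E) \le 1 .
```

So if P(A|E) is low, no single probability function can make P(A & C|E) fairly high.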

One could object here that the coding scheme hypothesis is somehow 
more on the side of evidence than theory, but this cannot be true in any
straightforward sense. For C amounts to no more than: there are intelligent
aliens and they are using coding scheme x;¹ thus if it were evidence it would
be certain that intelligent extra-terrestrials exist and we are not in that 
position after the articulation of the coding hypothesis. 

It is more plausible to object that the new evidence acquired is not C’s 
content but rather its mere existence. That is, we now have the additional 
evidence that such an hypothesis is in the field. What we learn is the relation 
between C and E (the relation may be either that of C explaining E or E 
confirming C, or even some more esoteric relation of confirmation theory) 
which until the articulation of C had escaped our attention. This picture 
suggests the approach taken by Daniel Garber [1983] in answer to the so- 
called problem of old evidence. His approach involves the assignment of 
non-trivial probabilities (i.e., values strictly between 0 and 1) to statements
of logical entailment between sentences. In our example we do not have 
such a strong logical relation as entailment but we may still attempt a 
solution along Garber’s lines. Then the stages of our problem may be 
expressed as follows. We begin with P₀(A) very low. Then E is received


¹ I don’t see how one can avoid having the coding hypothesis entail the existence of an
intelligent source, for any use of the terms ‘means’, ‘is to be interpreted as’, ‘was intended 
to represent’, etc., clearly carries the implication of an intelligent source. Thus the coding 
hypothesis is in essence no different than the conjunction A & C itself. 


and accepted and learning proceeds by conditionalisation. So P₁(A) =
P₀(A|E), which is still low. Then C is formulated and we see that it
stands in some special relation to E (the relation of confirmation, or
explanation or whatever) which we may symbolise as R(CE). Then
P₂(A) = P₁(A|R(CE)), which finally is a respectably high probability. The
changing probability of C will follow a very similar path.

We may now explain the above seeming paradox of P(A & C|E) being 
greater than P(A|E) as a confusion about the available evidence. What we 
have is that P(A & C|E & R(CE)) is greater than P(A|E) which is no paradox 
at all. 

I believe that such an approach succeeds formally in restoring the struc- 
ture of conditionalisation on a single probability function to confirmation. 
I also think that such an approach could never fail to succeed, no matter 
the example. This is because the value of the conditional probability on 
evidence involving the relation R, as in P(A|R(CE)), is simply chosen to 
solve the given problem. Where I speak of altering the probability function, 
this approach speaks of the value of this conditional probability, both 
equally mysterious. Indeed, on the face of it, it is far from clear why learning 
that a certain hypothesis stands in a certain relation to certain evidence 
should enhance our degree of belief in anything (beyond, of course, state- 
ments reporting this relation). For example, it was well known that the 
Copernican hypothesis explained the long observed retrograde motion of 
Mars. This did not enhance the probability of Copernicanism in most 
people’s minds, as is illustrated by the above quote from Bellarmino. He 
is well aware that the Copernican hypothesis and the evidence of Mars’ 
retrograde motion stand in our relation R, but does not take this to increase 
the credibility of Copernicanism. Indeed, there are any number of possible 
and contradictory hypotheses which stand in this relation to the evidence 
and they cannot all enjoy an increase in probability upon their mere 
announcement.

It took long schooling before people’s conditional probabilities fell into 
line with the above story of confirmation. When the Copernican picture 
took hold of people’s minds and seemed plausible in itself, then the evidence 
confirming it could be marshalled to good effect. 

I also note a somewhat curious fact. It seems reasonable to hold that the 
relation of confirmation between C and E is not relevant to whether E is
true or not, that P(E|R(CE)) = P(E). For why should learning that E
confirms some hypothesis alter the probability of E? (Example: the hypothesis
that there is life on Mars stands in this confirmation relation to the prop- 
osition that there are canals on Mars. Now, does learning that alter the 
probability that there are canals on Mars?) Secondly, in our example, would 
learning that there are intelligent aliens alter the probability that R(CE) 
given E? There seems no reason to suppose so. Then, P(R(CE)|E & 
A) = P(R(CE)|E). From these two apparently reasonable premises it is easy 
to prove that R(CE) is irrelevant as extra evidence, i.e., P(A|E & R(CE))


= P(A|E).¹ If this is correct then the mere recognition that the hypothesis
stands in relation R to the evidence should not increase the probability that 
there are intelligent aliens. Hence the need to view the apparent shift in 
probability as a shift to a new probability function or plausibility scale. Of 
course, we can assign values which falsify the needed premises, but there 
seems little justification for this except to save the traditional line on con- 
firmation and learning. I suggest it is more plausible to consider this a 
genuine difficulty for the traditional theory. 
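
The irrelevance result can also be checked numerically. The sketch below builds a small joint distribution over A, E and R(CE) (written R) in which both premises hold, and confirms that conditioning on R in addition to E leaves the probability of A unchanged. All the numerical values are hypothetical, chosen only so that the premises are satisfied; the helper names `joint` and `prob` are likewise assumptions of this illustration.

```python
from fractions import Fraction as F
from itertools import product


def joint(p_e, p_a_given, p_r_given):
    """Return {(a, e, r): probability} for three binary propositions.

    p_a_given[e] is P(A | E=e); p_r_given[(a, e)] is P(R | A=a, E=e).
    """
    dist = {}
    for a, e, r in product((True, False), repeat=3):
        pe = p_e if e else 1 - p_e
        pa = p_a_given[e] if a else 1 - p_a_given[e]
        pr = p_r_given[(a, e)] if r else 1 - p_r_given[(a, e)]
        dist[(a, e, r)] = pe * pa * pr
    return dist


def prob(dist, event, given=lambda a, e, r: True):
    """P(event | given), computed by summing over the joint distribution."""
    den = sum(p for w, p in dist.items() if given(*w))
    num = sum(p for w, p in dist.items() if given(*w) and event(*w))
    return num / den


# Hypothetical values, picked only so that both premises come out true.
dist = joint(
    p_e=F(1, 2),
    p_a_given={True: F(1, 10), False: F(1, 20)},
    p_r_given={(True, True): F(2, 5), (False, True): F(2, 5),
               (True, False): F(4, 5), (False, False): F(36, 95)},
)

is_a = lambda a, e, r: a
is_e = lambda a, e, r: e
is_r = lambda a, e, r: r

# First premise: P(E | R) = P(E).
premise_1 = prob(dist, is_e, is_r) == prob(dist, is_e)
# Second premise: P(R | E & A) = P(R | E).
premise_2 = prob(dist, is_r, lambda a, e, r: e and a) == prob(dist, is_r, is_e)

# Conclusion: P(A | E & R) = P(A | E).
p_a_given_e = prob(dist, is_a, is_e)
p_a_given_e_and_r = prob(dist, is_a, lambda a, e, r: e and r)
print(premise_1, premise_2, p_a_given_e == p_a_given_e_and_r)
```

In this toy distribution R is not independent of A once E fails, yet given E it adds nothing, which is all the argument requires.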

The view I am defending then is no more than that we engage in science 
in order to interpret a puzzling world. Like the literary critic or the cryp- 
tographer we start with and must abide by what the ‘book of the world’ 
provides us; we must indeed save the phenomena. But as we know that
the characters in a novel must be more complex than mere producers of the 
behaviour ascribed to them and no more, we know that there is more to the 
world than what we observe. Our interpretations take this into account, 
and, strikingly, put into order the world of appearance and go some way 
toward maintaining this order into future appearances. A new interpretation 
may strike us as providing a better way to order the world and when it 
strikes us thus it ipso facto becomes more plausible. This is why a theory
can seem more likely than a proper part of it, for until the theory is articulated
that part may be assigned a very low probability (perhaps, as an example, 
the bald assertion that the earth moves around the sun in 1400). But, of 
course, once the new plausibility scale emerges this part will partake in the 
general increase in probability.²

If reinterpretation changes probability assignments and if increase in 
explanatory power can lead to reinterpretation then explanatory power can 
be a confirmational as well as an informational virtue. But if it is true that 
new theories or new interpretations actually alter rational individuals’ or 
groups’ probability functions then the question of justification naturally 
arises. For while it is true that after the reinterpretation things ‘hang 


¹ The proof would go somewhat as follows:

1. P(A|E & R(CE)) = P(A & E & R(CE))/P(E & R(CE))
2.                = P(A & E) × P(R(CE)|A & E)/P(E & R(CE))
3.                = P(A & E) × P(R(CE)|E)/P(E & R(CE))
4.                = P(A & E) × P(R(CE))/P(E & R(CE))
5.                = [P(A & E) × P(R(CE))]/[P(E) × P(R(CE))]
6.                = P(A & E)/P(E)
7.                = P(A|E)

Where 3 follows by our second premise, 4 by Bayes’ Theorem and the first premise.

² Of course, one cannot predict these changes in one’s plausibility scale beforehand, and this
is of some importance, for van Fraassen has shown that if one has estimates of future changes
in one’s probability function then one can be shown (now) to be irrational in a technical
sense (see his [1984]). But it seems that if one has such estimates then one has already altered
one’s probability function in some way (see Sobel’s [1985] for a discussion of this point). If
one then denies this (by maintaining the integrity of the old probability function now) then
it hardly seems strange that one can be convicted of irrationality.




together better’, and we have more evidence for certain of our pet propo- 
sitions than we had before the reinterpretation (we are more secure, as 
Harper would put it), how do we know that we won’t reinterpret away all
this security at a later date? Hume asked how one could justify a certain 
inductive practice and suggested that there was no real answer—that this 
inductive practice was built into the human mind. 

Similarly, just why certain views seem more plausible than others to us 
is a difficult question to answer. There was a time when ‘demon-possession’ 
was a plausible explanation of certain sorts of behaviour. Why does this 
seem no longer plausible to us? The answer would seem to be that our 
present theories provide no room for demons in our world. Our present 
interpretation of the world precludes them. But this is an instance of a 
plausibility scale being reordered via the introduction of theory not a jus- 
tification for it. 

However, the full scale solution to the problem of the ‘reliability of 
reinterpretation’ cannot be attempted here. I will rest content with the 
reconciliation of credibility and confirmation which ‘theory driven plausi- 
bility re-ordering’ can provide. I think there can be little doubt that plausi- 
bility scales do alter with the articulation of new theories. It also seems 
clear that what we judge to be plausible we judge to be worthy of belief 
(more or less). Accepting a theory is then, in part, accepting a way of looking 
at the world. A new way of looking may suggest the modification of old 
likelihoods—and this is a matter of what we take to be true, or as likely to 
be true. 


University of Toronto 


REFERENCES 


EARMAN, JOHN [1983]: Testing Scientific Theories, Minnesota Studies in the Philosophy of 
Science, Vol. 10. University of Minnesota Press. 

GARBER, DANIEL [1983]: ‘Old Evidence and Logical Omniscience in Bayesian Confirmation 
Theory’, in J. Earman [1983]. 

GINGERICH, OWEN [1982]: ‘The Galileo Affair’, in Scientific American, Vol. 247, 2 (August).

GLYMOUR, CLARK [1980]: Theory and Evidence. Princeton University Press.

HANSON, N. R. [1958]: Patterns of Discovery. Cambridge University Press.

HARPER, WILLIAM [1983]: ‘Consilience and Natural Kinds’, Xerox.

HORWICH, PAUL [1982]: Probability and Evidence. Cambridge University Press.

SALMON, WESLEY [1984]: Scientific Explanation and the Causal Structure of the World. Prin- 
ceton University Press. 

SOBEL, HOWARD [1985]: ‘Self Doubts and Dutch Strategies’, Xerox. 

VAN FRAASSEN, BAS [1980]: The Scientific Image. Clarendon Press. 

VAN FRAASSEN, BAS [1983a]: ‘Theory Comparison and Relevant Evidence’, in J. Earman
[1983].

VAN FRAASSEN, BAS [1983b]: ‘Glymour on Evidence and Explanation’, in J. Earman [1983].

VAN FRAASSEN, BAS [1984]: ‘Belief and the Will’, in The Journal of Philosophy, 81, pp. 235-56.


Brit. J. Phil. Sci. 38 (1987), 319-337 Printed in Great Britain


Exploratory Factor Analysis, 
Instruments and the Logic of 
Discovery 

by DAVIS BAIRD 


1 Introduction
2 An Introduction to Factor Analysis
3 Factor Analysis does not Produce Hypotheses
4 Information-Transforming Instruments
5 Factor Analysis as an Information-Transforming Instrument
6 Factor Analysis, Instruments and the Logic of Discovery
Notes
References


1 INTRODUCTION


Philosophers distinguish the logic of discovery from the logic of justi- 
fication. The logic of justification usually is understood as a set of canons 
of inference for testing—either to confirm or refute—an explicit hypothesis. 
In contrast, the logic of discovery usually is understood as a set of canons
of inference for generating an hypothesis from data. From early in this 
century until recently it was common for philosophers to deny the possi- 
bility of a logic of discovery; there could be psychologically useful aids to 
promote discoveries, but these methods, and the hypotheses they generated,
could not be understood as privileged in any way. According to Larry
Laudan, prior to the turn of the century, the circumstances through which 
an hypothesis was discovered provided the primary reason for its accep- 
tance. The logic of discovery was a logic of epistemic warrant. Laudan 
worries that the recently revived interest in the logic of discovery cannot 
serve this same purpose: 


The older program for a logic of discovery at least had a clear philosophical problem 
of providing an epistemic warrant for accepting scientific theories. The newer 
program for the logic of discovery, by contrast, has yet to make clear what philo- 
sophical problems about science it is addressing. . . . If this chapter provides a partial 
answer to the question ‘Why was the logic of discovery abandoned?’, it poses
afresh the challenge: ‘Why should the logic of discovery be revived?’ (Laudan
[1981], pp. 190-1).


Factor analysis usually has been described as a partially mechanical 


method for generating hypotheses from a large number of basic measure- 
ments; as such, it appears to be part of a logic of discovery. Recently, 
however—following philosophical arguments against the possibility of a 
logic of discovery—the claim that ‘exploratory’ factor analysis is a special 
or privileged method of discovery has been challenged. Confirmatory, as 
opposed to exploratory, factor analysis has been offered as an alternative 
factor-analytic method for testing hypotheses. 

I have two aims in this paper. The first is to describe and evaluate
arguments against understanding exploratory factor analysis as a part of
the logic of discovery. I argue that the logic of discovery should not be
understood simply as canons for generating hypotheses. Although this is 
contrary to the usual understanding of the logic of discovery, we discover 
many things besides hypotheses which a proper logic of discovery must 
take into account. Consequently, the arguments used to criticise exploratory 
factor analysis mislead us because they assume that exploratory factor 
analysis must generate an hypothesis. 

My second aim is to provide an answer to Laudan’s challenge. One 
philosophical problem a revived logic of discovery should be interested in 
concerns the discovery of evidence, not hypotheses. I develop the concept 
of an ‘information-transforming’ instrument. Such instruments transform 
and summarise information input to the instrument into a form more useful 
for further research in the output. Microscopes, for example, transform the 
information present in cells into a form that makes the information needed 
to distinguish chromosomes from other parts of the cell readily accessible 
to human examination. Such transformed information can provide evidence 
that subsequently may be used to test or suggest hypotheses. First, however, 
we must discover the evidence itself. By presenting information usefully, 
information-transforming instruments allow us to discover not hypotheses, 
but evidence. 

Exploratory factor analysis should be understood as an information- 
transforming instrument. We should interpret the result of an exploratory 
factor study as potential evidence rather than an hypothesis. As such, 
exploratory factor analysis could become a justified part of the logic of 
discovery. 

I am not concerned here to address the substantive question of whether 
or not exploratory factor analysis is in fact such a justified part of the logic 
of discovery. Microscopes are instruments of discovery about the very 
small. Factor analysis is an instrument of discovery about the ensemble; 
with factor analysis we discover linear common factor structure in a popu- 
lation of correlated measurements. This does not imply that knowledge 
about linear common factor structure will teach us important lessons about 
the world. The early microscopes were poor tools for looking at the micro- 
world. Very possibly current exploratory factor analysis is a poor tool for 
looking at population structure. Its faults, however, do not lie in the fact 
that it is some kind of bogus ‘logic’ of discovery.


2 AN INTRODUCTION TO FACTOR ANALYSIS 


Classical factor analysis. Factor analysis is used to find patterns in
collections of correlations. For example, suppose six tests are given to each
person from a large population; tests 1, 2 and 6 are three tests of different
verbal abilities and tests 3, 4 and 5 are three tests of different numerical
abilities (Joreskog and Sorbom [1979], ch. 1). Correlations are computed
between scores on the various tests. These correlations can be tabulated in 
a matrix. (I suppress here issues of statistical inference by assuming that 
the true population correlations are available): 


      x1     x2     x3     x4     x5     x6
x1    1
x2    .720   1
x3    .378   .336   1
x4    .324   .288   .420   1
x5    .270   .240   .350   .300   1
x6    .270   .240   .126   .108   .090   1


Thus the correlation between scores on test 2 and test 4 is 0.288. 

Suppose all the test scores were correlated. A factor-analytic solution
would decompose each test score as a linear combination of a single factor,
f (common to all of the six tests), and a unique factor, eᵢ (unique for each
particular ith test):


x₁ = a₁f + e₁; x₂ = a₂f + e₂; etc.


On the other hand, test-score correlations might fall into two groups. In 
such a case, a factor-analytic solution would decompose each test score as 
a linear combination of two common factors, f¹ and f², and a unique factor:


x₁ = a₁f¹ + b₁f² + e₁; x₂ = a₂f¹ + b₂f² + e₂; etc.


If the correlations fall into three groups, the factor solution would decom- 
pose test scores as linear combinations of three common factors; etc. No 
matter how many common factors are involved, the factor equations relating 
common factors to test scores always are linear. 

Constraints are imposed on these linear relationships. Common factors 
are assumed to have zero correlation with the unique factors. Unique 
factors are themselves assumed to be mutually uncorrelated. Perhaps most 
important, the correlation between scores on two different tests conditional 
on a fixed value of all their common factors is assumed to be zero; the 
common factors ‘account for’ all the correlation found between observed 
test scores. 
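
In general form, and assuming that test scores and common factors are standardised (an assumption made here for illustration), the model and its constraints can be written as follows. The symbols λ_ik for loadings, φ_kl for correlations between common factors, m for the number of common factors and p for the number of tests are not the text's own notation; for m = 2, λ_i1 and λ_i2 correspond to the coefficients a_i and b_i above.

```latex
% Linear common-factor model for test i
x_i = \sum_{k=1}^{m} \lambda_{ik} f^{k} + e_i , \qquad i = 1, \dots, p

% Constraints on the unique factors
\operatorname{Corr}(f^{k}, e_i) = 0 , \qquad
\operatorname{Corr}(e_i, e_j) = 0 \quad (i \neq j)

% Implied correlation between two different tests
\operatorname{Corr}(x_i, x_j) = \sum_{k=1}^{m} \sum_{l=1}^{m}
    \lambda_{ik}\, \phi_{kl}\, \lambda_{jl} \qquad (i \neq j)
```

The last line expresses the sense in which the common factors ‘account for’ all the correlation between observed scores: nothing but loadings and factor correlations enters the right-hand side.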


By imposing these constraints, it is possible to determine the minimum 
number of factors needed to ‘explain’ all the correlations between observed 
test scores. It can be shown, for instance, that the correlations in the example 
above cannot be accounted for by a single factor; two common factors are 
required. This is a constructive process. Specific factors are found such that 
observed test scores are a specific linear function of these factors. This is 
the initial solution to a factor analysis. 
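
One quick way to see that a single common factor will not do, offered as an illustration rather than as the constructive procedure referred to above: if one factor accounted for every correlation, then r_ij = a_i × a_j for every pair of tests, so for any four tests the three products r_ij r_kl, r_ik r_jl and r_il r_jk would have to be equal (the classical ‘tetrad’ condition). The sketch below checks this on the example matrix; the function name is an assumption.

```python
from itertools import combinations

# Correlations from the example matrix, keyed by (lower test, higher test).
R = {
    (1, 2): 0.720,
    (1, 3): 0.378, (2, 3): 0.336,
    (1, 4): 0.324, (2, 4): 0.288, (3, 4): 0.420,
    (1, 5): 0.270, (2, 5): 0.240, (3, 5): 0.350, (4, 5): 0.300,
    (1, 6): 0.270, (2, 6): 0.240, (3, 6): 0.126, (4, 6): 0.108, (5, 6): 0.090,
}


def max_tetrad_difference(tests):
    """Largest gap between the three pairings' products over all quadruples.

    Under a one-factor model every such gap would be zero.
    """
    worst = 0.0
    for i, j, k, l in combinations(sorted(tests), 4):
        products = (
            R[(i, j)] * R[(k, l)],
            R[(i, k)] * R[(j, l)],
            R[(i, l)] * R[(j, k)],
        )
        worst = max(worst, max(products) - min(products))
    return worst


worst = max_tetrad_difference(range(1, 7))
print(f"largest tetrad difference: {worst:.4f}")
```

For tests 1 to 4, for instance, r12 × r34 = 0.3024 while r13 × r24 is roughly 0.1089, so the single-factor pattern is clearly violated.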

The solution discovered at this stage of the factor analysis generally is 
not unique. Other specific factors can be found that can be employed in 
other linear equations to reproduce observed test scores equally well. A 
‘rotated’ or ‘optimal’ solution is chosen from among these equivalent initial 
solutions according to what has been called the ‘principle of simple struc- 
ture’. This notion was first developed by L. L. Thurstone [1947]. A modern 
text describes it as follows: 


The matrix of factor loadings [coefficients attached to the common factors in the 
linear equations] shall have as many zero elements as possible. . .. A variable should 
not depend on all the common factors but only on a small proportion of them. Also 
the same factor should be involved only in a small proportion of the variables. Such 
a matrix is regarded as giving the simplest structure and presumably the one with 
the most meaningful psychological interpretation (Joreskog and Sorbom [1979], p.
12).
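
The non-uniqueness that simple structure is meant to resolve can be illustrated with a small sketch. It assumes, for simplicity, two uncorrelated common factors and a set of purely hypothetical loadings; rotating the loadings through any angle reproduces exactly the same correlations, but only one orientation has the zeros that simple structure favours. The function names and the 30-degree angle are arbitrary choices for the illustration.

```python
import math


def rotate(loadings, theta):
    """Rotate each (f1, f2) loading pair through the angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c - b * s, a * s + b * c) for a, b in loadings]


def reproduced(loadings):
    """Matrix of inner products of loading rows (uncorrelated factors assumed)."""
    n = len(loadings)
    return [[sum(x * y for x, y in zip(loadings[i], loadings[j]))
             for j in range(n)] for i in range(n)]


def zero_count(loadings, tol=1e-9):
    """Number of loadings that are (numerically) zero."""
    return sum(abs(v) < tol for row in loadings for v in row)


# Hypothetical loadings with simple structure: each test loads on one factor.
simple = [(0.9, 0.0), (0.8, 0.0), (0.0, 0.7), (0.0, 0.6)]
rotated = rotate(simple, math.radians(30))

# Both solutions reproduce the same products, entry by entry.
same_fit = all(
    math.isclose(p, q, abs_tol=1e-12)
    for row_p, row_q in zip(reproduced(simple), reproduced(rotated))
    for p, q in zip(row_p, row_q)
)
print(same_fit, zero_count(simple), zero_count(rotated))
```

The data cannot choose between the two solutions; the preference for the one with more zero loadings is an additional principle.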


In the example above, there is a two-factor solution with simple 
structure: 


x₁ = 0.9f¹ + 0f² + e₁ = 0.9f¹ + e₁
x₂ = 0.8f¹ + 0f² + e₂ = 0.8f¹ + e₂
x₃ = 0f¹ + 0.7f² + e₃ = 0.7f² + e₃
x₄ = 0f¹ + 0.6f² + e₄ = 0.6f² + e₄
x₅ = 0f¹ + 0.5f² + e₅ = 0.5f² + e₅
x₆ = 0.3f¹ + 0f² + e₆ = 0.3f¹ + e₆


With many factor loadings [slope coefficients] equal to 0, each common
factor relates only to observed scores on three tests, and each observed test
score depends on only one common factor. Indeed, a tempting
psychological interpretation is that f¹ is a verbal factor while f² is a numerical
factor. 
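
As a check on the two-factor solution, the sketch below recomputes every correlation in the example matrix from the loadings (0.9, 0.8 and 0.3 on f¹ for tests 1, 2 and 6; 0.7, 0.6 and 0.5 on f² for tests 3, 4 and 5). One ingredient is an assumption: the text does not state the correlation between the two common factors, and the value 0.6 used here is simply the one that makes the table come out.

```python
import math

# (loading on f1, loading on f2) for tests 1..6, from the two-factor solution.
LOADINGS = {1: (0.9, 0.0), 2: (0.8, 0.0), 3: (0.0, 0.7),
            4: (0.0, 0.6), 5: (0.0, 0.5), 6: (0.3, 0.0)}

# ASSUMPTION: correlation between f1 and f2, inferred from the table,
# not given in the text.
PHI = 0.6

# Correlations from the example matrix, keyed by (lower test, higher test).
TABLE = {
    (1, 2): 0.720,
    (1, 3): 0.378, (2, 3): 0.336,
    (1, 4): 0.324, (2, 4): 0.288, (3, 4): 0.420,
    (1, 5): 0.270, (2, 5): 0.240, (3, 5): 0.350, (4, 5): 0.300,
    (1, 6): 0.270, (2, 6): 0.240, (3, 6): 0.126, (4, 6): 0.108, (5, 6): 0.090,
}


def implied(i, j, phi=PHI):
    """Correlation between tests i and j implied by the loadings and phi."""
    a_i, b_i = LOADINGS[i]
    a_j, b_j = LOADINGS[j]
    return a_i * a_j + b_i * b_j + phi * (a_i * b_j + b_i * a_j)


# Pairs whose implied correlation differs from the tabulated value.
mismatches = [pair for pair, r in TABLE.items()
              if not math.isclose(implied(*pair), r, abs_tol=1e-9)]
print("mismatches:", mismatches)
```

With the factor correlation set to 0 instead, every correlation between a verbal and a numerical test would vanish, so on this reading the two factors in the example are themselves correlated.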

This, in brief, is ‘classical’ factor analysis. Each of the steps—from the 
initial matrix of correlation coefficients to the determination of an initial 
solution (by means of the constraints imposed by the assumption of linear 
common-factor structure), and from the initial solution to the rotated solu- 
tion (by means of the principle of simple structure)—can be automated. 
Factor analysis appears to be an automatic way to convert ‘raw data’— 
the correlation matrix of test scores—into ‘deep structure’ explanatory 
hypotheses. For example, each person has a basic verbal ability (factor f¹


above) and a basic numerical ability (factor f²). Thus, factor analysis seems
a good candidate for a logic of discovery. 

There are important and difficult questions about sampling and, conse- 
quently, estimating factor solutions. There are important issues involved 
in testing the adequacy of the assumptions that create the constraints from 
which a solution is ‘extracted’. There are important problems with the 
assignment of factor values (‘factor scores’) to individual test takers.
While all these issues pose interesting—and difficult—statistical and philo- 
sophical problems, they are not crucial to my inquiry into factor analysis 
as a part of a logic of discovery. 


‘Confirmatory’ factor analysis. Factor analysis takes only the matrix of
test-score correlations as input. It produces a final common-factor solution as
output. Thus, it proceeds in a virtual vacuum of substantive theory. Of
course, there are assumptions built into the process; but these assumptions
cannot be understood as substantive theoretical assumptions for they have
no particular subject matter. Because factor analysis can proceed in a
theoretical vacuum, it is a good candidate for a purely empirical method.
But the state of research eventually should need to incorporate
substantive theoretical assumptions. Suppose, for example, that psychological
researchers were certain that humans have two primary mental abilities,
numerical and verbal. Factor analysis, as practiced through the mid-1960s,
could not incorporate such information; at best it could corroborate such
information. During the 1960s this limitation was removed and
‘confirmatory’ factor analysis was born.

In a series of papers Karl Joreskog, among others, developed a general 
factor-analytic model and a series of statistical techniques for fitting cor- 
relational data to this model (Joreskog [1966]; [1967]; [1969]; Joreskog 
and Lawley [1968]). Joreskog’s method allows researchers to stipulate in 
advance as many constraints, motivated by substantive theory, as desired. It 
then produces the ‘best fitting’ fully specified model within these previously 
established constraints. Finally, the model produced is tested against the 
data with a generalised chi-square goodness of fit test. All these innovations 
have been built into a series of increasingly more flexible computer pro- 
grams (most recently, LISREL6; see Long [1983]). 

Joreskog’s work introduces two distinct innovations. First, he provides 
a more flexible method for incorporating substantive theory into the search 
for a factor analysis of correlational data. Second, his method tests the 
goodness of fit between the constrained factor model and the data. Each of 
these innovations has been cited as the crucial improvement ‘confirmatory’ 
factor analysis brings to the classical model (see Long [1983], pp. 12, 15 
and 68; and Mulaik [1985a], p. 1). Thus, Joreskog’s work, taken at face 
value, is neutral with respect to the distinction between a generationist logic 
of discovery and a confirmationist logic of justification. The increased 
flexibility of Joreskog’s (confirmatory) factor-analytic model is of little 


philosophical interest. The philosophically interesting question is whether 
or not ‘confirmatory’ factor analysis, practiced as a post hoc test procedure, is 
a better methodology than exploratory factor analysis—because exploratory 
factor analysis attempts to be an impossible logic of discovery. 


For the confirmatory interpretation of ‘confirmatory’ factor analysis. Stanley
Mulaik argues that confirmatory factor analysis is an important innovation 
precisely because it finally signals, 100 years later than in the natural 
sciences, the recognition by factor analysts that there can be no privileged 
exploratory methods in science. 


Confirmatory factor analysis was perhaps the first of many confirmatory methods 
now available to the researcher in the realm of multivariate statistics. The rise of 
these confirmatory techniques reflects a shift in emphasis from purely inductive 
approaches to hypothesis testing approaches, which mirrors a comparable shift that 
occurred over a century and a half ago in the physical sciences (Mulaik [1985a], 
p. 52). 


Mulaik does not argue that researchers should not use exploratory methods, 
just that there is nothing special about the hypotheses generated by an 
exploratory study: 


Scientists who may find some use for exploratory statistics in provoking hypotheses 
should realize that looking at patterns in data is not the only basis for the generation 
of hypotheses and may often yield ambiguous results (Mulaik [1985b], p. 427).


For Mulaik, it is the subsequent testing of an hypothesis that distinguishes 
it, not the method that produced it. In short, there is no logic of discovery; 
there are only useful aids in bringing about discoveries. 

Mulaik presents two fundamental arguments against exploratory factor 
analysis. First, he urges that there are ‘no rationally optimal ways to extract 
knowledge from experience without making certain prior assumptions...’ 
(Mulaik [1985a], p. 13). Since prior assumptions must be made, exploratory 
factor analysis cannot simply ‘produce’ knowledge from raw data; worse, 
the prior assumptions may be wrong as well, and invalidate the whole 
procedure. Second, Mulaik points out that the factors produced by factor 
analysis do not come with an established interpretation; the only concepts 
available for giving them an interpretation are the interpretations assigned 
to the original test variables. Thus, factor analysis cannot provide new 
deep-structure concepts which could be used in explanations of surface 
level correlations. 

Mulaik shows that exploratory factor analysis fails as an inductivist 
method, as he understands such methods. According to Mulaik, inductivism
is a method whereby a careful examination of ‘bare data’, unprejudiced
by the prior adoption of any hypotheses, produces incorrigible
hypotheses (Mulaik [1985a], p. 2). The method both produces the results 
and simultaneously warrants their incorrigibility. Surely the results of an 
exploratory factor analysis are not incorrigible. Surely the linear assump- 


tions prejudice the analysis of ‘bare data’. Surely exploratory factor analysis 
does not produce results free of ambiguity. Factor analysis is not a good 
Mulaik-inductivist method. 

Mulaik argues that much of factor analytic technique can be reinterpreted 
and salvaged as a hypotheticist technique. In contrast to inductivism, hypo- 
theticism examines data in light of previously entertained hypotheses. The 
point, however, either is to confirm or to falsify the entertained hypotheses; 
it is not to generate new hypotheses. Once factor analysis is understood as a 
hypotheticist technique, Mulaik’s criticisms are no longer compelling. That 
assumptions are necessary to run a factor analysis does not pose a problem, since one 
assumes that assumptions are made, and tested, by hypotheticist methods. 
Similarly, the need to interpret the results does not pose a problem, since 
one assumes that the interpretation of the factors is established prior to 
running a factor analysis, which then serves to test this interpretation. Jöreskog’s 
‘confirmatory’ factor model is important for Mulaik precisely because it can be 
used as a confirmatory method, and not simply because it can be used as a 
more flexible exploratory method. 


3 FACTOR ANALYSIS DOES NOT PRODUCE HYPOTHESES 


The most important problem with Mulaik’s arguments is his exclusive 
focus on hypotheses. A Mulaik-ian scientist must be either generating or 
testing hypotheses. But scientists do many other things: they genetically 
engineer cells to make insulin; they develop instruments to allow them to 
observe the solar corona; they investigate the mathematical properties of the 
‘general linear model’. Factor analysis may be a poor method of hypothesis 
generation, but it need not be inferred from this that it is a poor method of 
discovery and therefore should be reinterpreted as a method of hypothesis 
testing. 


Hypotheses must be interpreted. As Mulaik points out, the result of a factor 
analysis is just a set of equations. The common factors, in particular, do 
not come with a substantive interpretation. This is just what we would 
expect: factor-analytic technique is autonomous and involves no particular 
substantive theory. The same input correlations and output linear equations 
could have to do with human mental abilities or tornado behavior. Mulaik 
concludes that the hypothesis that results needs interpretation. I conclude 
that the ‘hypothesis’ which needs interpretation is no hypothesis; it is 
simply a convenient or useful presentation of certain patterns in the input 
correlation matrix. 

Factor analysis simply re-expresses correlational data. Consider the 
following: 


In any case, exploratory component analysis, like exploratory factor analysis, is a 
mathematical tautology superimposed upon a set of variables, and as a tautology it 
cannot be disconfirmed or rejected by any set of observed data (Mulaik [1985a], 
pp. 20-1). 


I would say instead that both exploratory factor analysis and its variant, 
exploratory component analysis, are particular methods for re-expressing 
correlational data in a particular kind of summary fashion. Such summaries 
may or may not be useful, depending on where the data come from in the 
first place and what a particular researcher wants to do with them. Such 
summaries by themselves are not hypotheses. 

For example, the United States Bureau of the Census’s Historical Statistics 
of the United States tells us that the population of South Carolina in 
1930 was 1,739,000 (p. 34), and that South Carolina occupied 30,495 square 
miles (p. 38). We can compute that there were 57.0 people per square mile 
in South Carolina. This fact, as it stands, is not an hypothesis. It is simply 
a restatement—a particular numerical summary—of accepted information. 
One does not say, “The land area is 30,495 square miles and the population 
is 1,739,000 persons, so I hypothesise that the population density is 
1,739,000/30,495 persons per square mile.’ There is nothing hypothetical 
in the computed population density. 

Claims about the population density can be used as hypotheses. This is 
most common when laws relating population density and some other vari- 
able are hypothesised. Even as a simple statement, the claim that South 
Carolina had a population density of 57.0 persons per square mile could be 
understood as an hypothesis. This can happen, for example, when we 
discover that this value differs from the population density reported on 
page 34 (of 56.8 persons per square mile). Then we can hypothesise that the 
computed value is correct and the other value is wrong. Such an hypothesis 
would be confirmed or refuted by discovering the source of the dis- 
agreement between the two figures. In such a case the computed value (and 
the numbers from which it was computed) may be rejected; in contrast, 
when the number simply is a computed ratio of accepted values, it is not 
open to rejection—unless the arithmetic is done poorly. 
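
The arithmetic behind this example can be checked with a short sketch. The variable names are illustrative assumptions; the figures are the ones quoted in the text.

```python
# Figures quoted in the text from Historical Statistics of the United States.
population = 1_739_000        # persons, South Carolina, 1930
land_area_sq_miles = 30_495   # square miles

# The computed density is only a re-expression of the two accepted figures.
computed_density = population / land_area_sq_miles

# The density that the text says is reported separately in the same source.
reported_density = 56.8

# A disagreement between the two values is what turns the computed ratio
# into a claim that can be treated as an hypothesis.
discrepancy = computed_density - reported_density

print(f"computed: {computed_density:.1f} persons per square mile")
print(f"reported: {reported_density:.1f} persons per square mile")
print(f"discrepancy: {discrepancy:.1f}")
```

Nothing in the division is open to rejection; only the comparison with the independently reported figure raises a question to be settled.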

The case of factor analysis is a bit more complicated, but not different in 
principle. The complication arises from the fact that there are choices 
available for fixing the rotated or optimal solution. So it might appear that 
one factor solution might be right where the others are wrong. I prefer to see 
them all as different, analytically specified, summaries of the correlational 
information input to the analysis. Some particular summaries may prove 
more useful than others. But there are a variety of ways in which a factor 
analysis can be useful and, important for my current point, the question of 
the usefulness of a particular factor analytic summary is extrinsic to the 
summary itself. 
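
A minimal sketch may make the point about rotation concrete. It assumes two orthogonal common factors and uses hypothetical loadings for four tests; the function names are illustrative and not drawn from any factor-analysis package. Rotating the factor axes changes every loading, yet the correlations reproduced by the common factors stay the same.

```python
import math

# Hypothetical loadings of four tests on two common factors (illustrative values).
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.1, 0.6],
]

def rotate(matrix, angle_degrees):
    """Apply an orthogonal rotation of the two factor axes to every row of loadings."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in matrix]

def common_part(matrix):
    """Loadings times their transpose: reproduced correlations off the diagonal,
    communalities on the diagonal."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in matrix]
            for row_i in matrix]

def nearly_equal(left, right, tolerance=1e-12):
    return all(abs(x - y) < tolerance
               for row_l, row_r in zip(left, right)
               for x, y in zip(row_l, row_r))

rotated = rotate(loadings, 30.0)

loadings_differ = not nearly_equal(loadings, rotated)
same_summary = nearly_equal(common_part(loadings), common_part(rotated))
print(loadings_differ, same_summary)
```

On this sketch, choosing between the two solutions is a choice between equivalent summaries of the same correlational information, which is why the question of usefulness has to be settled from outside the summary.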


The problem of realism. Do the common factors that result from a factor 
analysis describe any real structure? In a recent book, S. J. Gould takes 
strong issue with factor analysis on this point (Gould [1981], ch. 6). Factor 
analyses of correlations between tests of many different mental abilities 
produce a single common factor, g. Researchers then seem to infer, illegit- 
imately according to Gould, the existence of a human property called 
‘general intelligence’. 

Gould likely is right concerning the social and political uses and abuses 
of intelligence quotient research; but, in their own technical presentations, 
intelligence researchers have been much more circumspect. In his most 
mature treatment of human intelligence and factor analysis, Charles Spearman 
wrote: 


But notice must be taken that this general factor g, like all measurements anywhere, 
is primarily not any concrete thing but only a value or magnitude. Further, that 
which this magnitude measures has not been defined by declaring what it is like, 
but only by pointing out where it can be found (Spearman [1927], p. 75). 


Where this factor can be found, of course, is concealed among the cor- 
relations between tests of many different mental abilities; it is revealed by 
factor analysis. Spearman may have thought g was real, but he also recog- 
nised that factor analysis was not able to tell him what it was. Cyril Burt, 
perhaps the most famous—and infamous—of factor analysts, also empha- 
sised the need to include substantive theoretical and empirical con- 
siderations distinct from factor analyses, to properly understand the results 
of a factor analysis (Hearnshaw [1964], p. 204). 

I do not claim that these brief remarks settle either the historical question 
of how researchers interpreted the common factors, or the contemporary 
problem of how we should interpret them. The sustained confusion on this 
issue suffices to make my point. Factor analyses do identify real patterns in 
a correlation matrix. These patterns may not persist in the correlation 
matrix of replications of the tests; or, if the patterns do persist, they may 
be artifacts of the correlational summarising of the test data and the 
factor-analytic summarising of the correlations. The patterns are real enough, 
even if they do not correspond to a natural piece of the real world. Further 
substantive theoretical and empirical investigation is necessary to determine 
if these real patterns connect with some specific part of the real world. 


4 INFORMATION-TRANSFORMING INSTRUMENTS¹ 


Factor analysis ought to be understood as an instrument instead of as a 
method of hypothesis generation, confirmation or refutation. In order to 
make this claim clear, I present here six theses about scientific instruments. 
Of course, a tremendous number of different scientific instruments are put 
to a tremendous number of different uses. Of particular relevance to my 
discussion are what I shall call ‘information-transforming instruments’. These 
instruments transform information present in the input to the instrument 
into a more useful format at the output end of the instrument. An example 
is the microscope, which transforms information present in the input preparation 
of thinly sliced and dyed plant cells into an output image at the eyepiece, 
where previously indistinguishable structures may be distinguished 
by the human eye. 

¹ The following discussion about instruments is indebted to Hacking’s discussion of microscopes. 
My specific theses are not Hacking’s, although they are generalizations of points 
Hacking makes about microscopes. See Hacking [1983], ch. 11. 

Information-transforming instruments are important because they are 
used to mould and, in a certain sense, create empirical data. With such 
moulded data we begin to understand nature; and with such data we con- 
front nature with our various hunches, predictions, speculations and 
hypotheses. Microscopes do such duty, as do a dazzling variety of the 
products of scientific experimentation—including factor analysis. Here I 
briefly characterise these workhorses of empirical science. 


Thesis 1: Hypotheses not output. Instruments do not produce or directly 
test hypotheses. An image produced by a microscope is not an hypothesis; 
it is optically re-expressed information from the input sample. Certain 
hypotheses may be involved in the uses to which a microscopic study might 
be put. One researcher might hypothesise that a certain feature of the output 
image represents a mitochondrion; another might disagree, believing the 
feature to be an artifact. But, such hypotheses are about the output; the 
output itself is simply another bit of nature about which to hypothesise. 

Indeed, the output from a microscope frequently provides the empirical 
data with which to test an hypothesis, or from which an hypothesis suggests 
itself. Chromosomes are so named because they appear as dark structures 
in a stained preparation of cells (Portugal and Cohen [1977], p. 36). But a 
dark structure in the output of a microscope is no hypothesis; it is simply 
a dark structure. By observing the behaviour of these dark chromosomes 
during cell division, hypotheses did in fact present themselves about the 
role of chromosomes in heredity. The output image from the microscope 
provided the data which suggested the hypothesis; but the data was not 
itself an hypothesis. 


Thesis 2: Instruments summarise information. Generally the amount of 
information in the output from an instrument is less than the amount of 
information in the input to the instrument. This is especially obvious with 
measurement instruments. A scale may output a single number from an 
input of a particular sample of mercury. There is, however, vastly more 
information present in the sample than in such a single number. The 
particular numerical summaries produced by a scale, however, have proved 
useful because the information they contain is useful in further empirical 
and theoretical studies. 

Some potentially important information is always lost in summarising. 
However, the fact that important information can be lost is not, by itself, 
a criticism of the instrument. It would be silly to conclude that light 
telescopes are bad instruments because they ignore information outside the 
visible spectrum—even when such information is important. All instru- 
ments output only some of the available information in their input. It is a 
contingent matter, for further empirical and theoretical study, to determine 
if and how the information retained is useful or important. It is not a 
criticism of an instrument to say that it ‘summarises away’ important 
information. 


Thesis 3: Instruments can produce artifacts. Instruments can output information 
that has nothing, or nothing important, to do with the bit of nature 
under study. Such spurious information usually is called an artifact of the 
instrument. Sometimes an artifact can be caused by ignored aspects of the 
input to an instrument. Improperly cleaned glass slides and cover slips can 
produce artifacts in microscopy; an impure sample thought to be pure can 
produce an artifactual spectroscopic analysis. At other times an artifact may 
be caused by the particular (imperfect) functioning of the instrument. Early 
microscopes were plagued by many sources of distortion—particularly 
chromatic and spherical aberration. As a consequence, much of the infor- 
mation output by these microscopes was artifactual. Only when these 
sources of distortion were greatly reduced did microscopes become useful 
instruments. 

Here again, the fact that an instrument produces artifacts is not, by itself, 
a criticism of the instrument. People learn to ignore—and, indeed, not 
even see—the artifactual reflections on their eyeglasses. In his chapter on 
microscopes, Ian Hacking describes a study which established that a certain 
structure (called ‘dense bodies’) identified by a low-power electron microscope 
was not an artifact of the microscope (Hacking [1983], pp. 200 ff.). 
But had the dense bodies turned out to be artifacts, it still would have 
obviously been folly to dismiss the electron microscope. Equally obvious, 
it would have been a mistake not to investigate the status of dense bodies. 


Thesis 4: Instruments are best used interactively. Hacking emphasises this 
point in his discussion of microscopy: 


This is the first lesson: you learn to see through a microscope by doing, not just by 
looking. ... We see the tiny glass needle—a tool that we have ourselves hand crafted 
under the microscope—jerk through the cell wall. We see the lipid oozing out of 
the end of the needle as we gently turn the micrometer screw on a large, thoroughly 
macroscopic, plunger (Hacking [1983], pp. 189 ff.). 


We come to understand and trust the output of an instrument as we use 
it interactively: manipulate the input and watch the output. We learn 
about instruments, and we learn to learn from instruments, by doing with 
instruments. 

The manipulation Hacking describes might be called ‘real-time manipulation’; 
he simultaneously manipulates and watches. But not all manipulation 
needs to be, or can be, done in ‘real-time’. Early microscopy had 
to stain cells with dyes that killed the cells. Such samples could not be 
manipulated in the manner Hacking describes. However, we can manipu- 
late sequential samples of similar material. We can input samples from 
different stages in some on-going process to produce a series of ‘snapshots’ 
of a carefully manipulated event. The early studies of egg fertilization were 
done this way (Portugal and Cohen [1977], p. 37). No doubt real-time is 
preferable, but it is not the only way to learn by doing. 


Thesis 5: Preparation of input. The input to an instrument usually must be 
prepared specially for the instrument. Instruments summarise by ‘focusing’ 
only on certain kinds of properties in the input. Information expressed in 
terms of other properties is destroyed. If the relevant information in the 
input—as it occurs naturally—would not be retained by the instrument, 
the input needs to be prepared so this will not occur. 

For example, the output of an optical microscope relies on there being 
significant differences in the optical properties of the various parts of the 
input. Unfortunately, most cells are roughly uniformly translucent. Thus, 
to get useful information about cells from a microscope, the cells need to 
be ‘coloured’ in such a way that distinct structures show up distinctly. This 
is the purpose of differential staining. 

There need be no systematic theory of how or why certain preparations 
work. At least at the beginning stages of development and use, much 
preparatory work is accomplished largely through trial and error. In 
microscopy, 


[F]or example, methyl-green, a basic coal-tar color [and one of the first chemicals 
used for staining], stained the nucleus green and the cytoplasm red. While many 
diverse cell structures might stain similarly with the same dye, a number of limi- 
tations of the technique were noted. Staining reactions were found to be affected 
by the methods used to prepare the tissues and by the chemical composition of the 
solutions used in the staining procedures. It was uncertain to what extent staining 
was a physical rather than a chemical reaction (Portugal and Cohen [1977], p. 36). 


There was no absence of a theory of staining. But the theory of staining was 
one area of investigation; the use of staining was another. (Many biologists, 
though, contributed to both areas of study.) Happily, success in the study 
of staining was not a prerequisite for success in the use of staining. 


Thesis 6: Theory of instrument different from theory using instrument. 
Whatever theoretical understanding of an instrument’s operation we may 
have is an autonomous theoretical domain, distinguishable from the various 
substantive domains where the instrument is used. There now is a large 
discipline devoted to the study of microscopy. The discipline has its own 
internally generated problems, in addition to external demands from the 
various substantive disciplines that use microscopes. But the study of microscopes 
(or of any instrument) constitutes a domain of study separate from 
the various areas of biological research that use light microscopy. Thus, 
any particular kind of instrument is at once an object of study and improve- 
ment, and a tool for researchers in various fields of study. 

The successful use of an instrument does not require an understanding 
of the theory of its operation. A biologist may employ arguments from 
optics to convince a skeptic that a particular structure, apparent in the 
output from a microscope, is not an artifact. But such arguments from 
optics are not the only arguments available for such purposes. The status 
of the structure might be established empirically by finding the same struc- 
ture by other means. It might be established theoretically using substantive 
arguments from biological—not optical—theory. 


5 FACTOR ANALYSIS AS AN INFORMATION-TRANSFORMING 
INSTRUMENT 


Factor analysis is an instrument. These six theses describe factor analysis 
as well as more conventional instruments like microscopes. Instead of 
producing, confirming or refuting hypotheses, it is quite natural to think 
of factor analysis as an instrument for transforming information. Infor- 
mation in the form of a correlation matrix is input to a factor analysis and 
an alternative form of expressing some of this information—which identifies 
particular kinds of linear patterns in the data—is output (Theses 1 and 2). 
It is also worth noting that factor analysis itself is an autonomous domain 
of study (Thesis 6). Similarly, Theses 3, 4 and 5 describe factor analysis 
(but more on this below). 
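
The input-output picture can be illustrated with a deliberately simplified sketch. It extracts only the first principal component of a correlation matrix by power iteration, which corresponds to the exploratory component analysis mentioned earlier rather than to a full common-factor model; the correlation values and function names are hypothetical.

```python
import math

# Hypothetical correlation matrix for three tests (illustrative values only).
correlations = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]

def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def first_component(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector

eigenvalue, direction = first_component(correlations)

# Output of the "instrument": one loading per test on a single component,
# plus the share of total variance that this one component summarises.
loadings = [math.sqrt(eigenvalue) * d for d in direction]
share = eigenvalue / len(correlations)
print(loadings, share)
```

The output is a re-expression of part of the input matrix. Whether the single component corresponds to anything in the tested individuals is a further question that the computation does not address.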

I do not claim that these six theses constitute a sufficient characterisation 
of information-transforming instruments. I do not argue that since these 
theses correctly describe factor analysis, factor analysis necessarily must be 
a scientific instrument. Rather, the point is to appreciate an important and 
virtually ignored aspect of science: the use of instruments to transform and 
summarise information. I claim that it makes more sense to understand 
factor analysis as such an instrument, with a function similar to a micro- 
scope, than to see it as a technique for testing or producing hypotheses in 
multivariate contexts. Factor analysis should be understood as a method of 
data analysis, not as a technique of inferential statistics. 

My claim is revisionary. Factor analysts traditionally present their 
method as a method of hypothesis generation. However, I believe this is a 
consequence of the concepts that have been available for describing science. 
The function of instruments in scientific work has been all but ignored 
by philosophers of science. As a consequence, attempts to understand 
techniques for gathering data have been constrained to a strait-jacket of 
concepts for describing hypothesis confirmation, refutation or generation. 
This distorts both the practice of, and the attempt to produce a philosophy 
of, data gathering. 


Summaries and assumptions. In order to summarise, principles of selection 
are necessary. These principles of selection are encoded in the ‘assumptions’ 
which are built into factor analysis. In microscopy it is ‘assumed’ that 
contiguous parts of a specimen that stain homogeneously are natural parts 
of a cell. But this is not so much an assumption as an understanding of the 
operating characteristics—and constraints—of the instrument. An homo- 
geneously stained structure may turn out to be an artifact. The common 
factors turned up by a factor analysis also may be artifacts (Thesis 3). 

Factor analysis can only pick out linear patterns in the correlation data. 
This does not mean that factor analysis is useless if one variable depends 
on another in a non-linear way. Data must be prepared to ‘show their stuff 
in a linear fashion’ (Thesis 5). To this end, a wide variety of ways to 
transform scales of measurement have been developed so that relationships 
which when viewed with one scale are not linear, in a transformed scale are 
linear. More than this, finding the right transformation is largely an iterative 
and interactive process of trial and error (Thesis 4). Indeed, factor analysts 
will change scales and even change input variables entirely in order to find 
persistent, potentially interpretable patterns in the data.¹ 
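
As a hedged illustration of preparing data to show a linear pattern, the sketch below assumes a purely exponential dependence between two hypothetical variables. On the raw scale the linear correlation understates the dependence; after a logarithmic change of scale the relationship is exactly linear.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

# Hypothetical data: y depends on x exactly, but not linearly.
x = [float(i) for i in range(1, 11)]
y = [math.exp(value) for value in x]

raw_correlation = pearson(x, y)

# Change of scale: the logarithm of y is a linear function of x.
transformed_correlation = pearson(x, [math.log(value) for value in y])

print(raw_correlation, transformed_correlation)
```

In practice the right transformation is not known in advance, which is why the text describes the search for it as iterative and interactive.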

The fact that important information may be lost in the process of producing 
a factor-analytic summary is not, by itself, a criticism of factor 
analysis as an instrument. Similarly, the fact that the results of a factor 
analysis may represent no important or enduring piece of nature is not by 
itself a criticism of the factor-analytic instrument. 


Criticising factor analysis. What would count as a valid criticism of factor 
analysis? In general, factor analysis would not be a useful tool in a particular 
empirical domain if it were discovered that the specific kinds of 
summaries provided by factor analysis were not useful for the specific 
domain in question. 

My concern here is neither to evaluate factor analysis as I think it should 
be evaluated, nor to report on current and past research along these lines. 
I am simply attempting to clarify what the appropriate kinds of desiderata 
are for instruments in general and for factor analysis in particular. 

The instrument should be consistent—that is to say the instrument 
should be reliable. This involves two kinds of things: given the same or 
similar inputs the instrument should produce the same or similar outputs; 
and there should be a fair degree of consensus among those trained to use 
the instrument as to what a given output represents. That is, the outputs 
should be internally consistent (the instrument should not be capricious) 
and the outputs should be reasonably unambiguous. If the ‘principle of 
simple structure’, for example, were vague enough that a wide variety of 
different rotated solutions could be produced consistent with this principle, 
and if there were no further way of pruning the diversity of results, then 
factor analysis would lose on the score of reliability. 

¹ Most factor studies found in the literature do not reveal the interactive character of the use of factor 
analysis. Instead a final, polished factor study is presented, reporting the results that 
the researcher ultimately found, and found to be important. I am told that such reporting 
hides most of the iterative and interactive work that went into the study (Janarone, [1985]). 
Thurstone’s important work on the 8 primary mental abilities (Thurstone, [1947]), for 
example, required considerable reanalysis and test variable refinement before Thurstone 
published a final form of the study (Janarone [1985]). See also Thurstone [1948]. 

Seen as an instrument, the process of interpreting the factors produced 
by a factor analysis is extrinsic to the analysis itself. There may be little 
consensus about how to understand or interpret the factors. But, there will 
be complete agreement about what the output is; no one will argue about 
what the specific linear equations are that are output, only about what they 
are telling us about the world. 

The instrument should also produce information that is related to the 
bit of nature under study in some important respect—that is to say the 
information should be valid. A variety of means for arguing for the validity 
of an instrument have been developed. One of the most convincing is 
to produce the same information, given the same input, by a wholly 
independent means. For example, telescopes used for observing the heavens 
could be tested by directing them toward objects on the earth. Direct 
inspection at ten feet and telescopic inspection at 1000 feet are independent 
means to look at a given object. If direct observation is not possible—as it 
is not with microscopes—then some other instrument whose principle of 
operation is different should be used to corroborate the results. Hacking’s 
study of dense bodies with both an electron and an optical microscope is 
one such example. 

Such validity arguments are never fully conclusive, of course. It may be 
that telescopes work fine on the earth, but are completely distorting in the 
heavens. It may be that factor analysis performs well with spatial geometry, 
but is completely distorting in psychological territory. 

Other kinds of validity arguments concern whether the results of a factor 
study can be generalised (if the same or similar factors turn up in a different 
population of correlations obtained from a different population of individuals, 
our confidence in the reality of these factors increases) and whether 
the results suggest other ways to experiment with the bit of nature in 
question (information output by an instrument should fit in with, and help 
promote, an expanding theoretical understanding of the bit of nature in 
question). In short, a new instrument should spawn new theoretical and 
experimental studies in a variety of different populations. 

Validity is a contingent matter. It is not something to be determined 
a priori. X-rays, as a contingent matter of fact, happen to reveal useful 
information to dentists. But our teeth could have been made of different 
stuff. Similarly, it is a contingent matter whether or not the common 
factors revealed by factor analysis are useful in any particular field of study. 
Common factors may be artifacts in one field and revealing in another. 
Since this is an empirical matter, it is something to be settled case by case 
in the individual disciplines that use factor analysis. This is not to say that 
there is little value in establishing the operating characteristics of factor 
analysis overall, regardless of substantive application. This is only to say 
that the usefulness of the instrument, with whatever operating charac- 
teristics are established for factor analysis, is a matter to be determined at 
the level of application, not of instrumentation. 


The objects of factor analytic study. Microscopes reveal information about 
the very small, telescopes about the very distant; what does factor analysis 
reveal to us? The equations that result from a factor analysis describe the 
population of individual test measurements in terms of linear combinations 
of a number of constructed common factors. Thus, factor analysis reveals 
information about linear common-factor structure in populations of cor- 
related test measurements. 
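
In one common notation, which is an assumption here and not the author's own, the linear common-factor description of the j-th standardised test measurement can be sketched as:

```latex
z_j = a_{j1} F_1 + a_{j2} F_2 + \cdots + a_{jm} F_m + u_j , \qquad j = 1, \dots, n

r_{jk} = \sum_{p=1}^{m} a_{jp}\, a_{kp} \qquad (j \neq k)
```

Here the $F_p$ are the $m$ constructed common factors, the $a_{jp}$ are the loadings of test $j$ on factor $p$, and $u_j$ is the part of the measurement specific to test $j$. The second line gives the correlation between two tests implied by the first, under the further assumptions that the common factors are uncorrelated with unit variance and that the specific parts are uncorrelated with the factors and with one another.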

A question of greater interest is: Why is this information interesting 
or important? This is a question about the validity of the instrument. 
Researchers have argued that the common factors that result from a factor 
analysis represent some natural property of the individuals, measurements 
of which produce the population of correlated measurements being 
analysed. Other researchers, who espouse less realism in their philosophy, 
have argued that the common factor equations provide a means for economically 
describing the population of test measurements. Something of a 
middle ground is taken by those who urge that the common factor equations 
provide an accurate and economical means for predicting individual or 
group performance on future batteries of test measurements. 

Understanding factor analysis as a method for discovering common factor 
structure in populations of correlated test measurements, and not as a 
method for discovering some incorrigible or highly plausible hypothesis 
about the source of these measurements, has the advantage of being neutral 
with regard to questions about the interpretation and importance of the 
information factor analysis reveals. Questions about realism with regard to 
the information revealed by factor analysis, and about the discovery of this 
information by factor analysis in the first place, are thus separable and the 
answers can be accordingly more sensitive to the variety of uses factor 
analysis is put to. The instrument may be useful in different ways in 
different contexts to different researchers. 


6 FACTOR ANALYSIS, INSTRUMENTS AND THE LOGIC OF 
DISCOVERY 


It now is possible to answer the two problems I set at the beginning of the 
essay: (1) What should the status of modern exploratory factor analysis 
be in light of the criticisms that have been brought against it? (2) What 
philosophical problems should a revived logic of discovery—at least as it 
appears in exploratory factor analysis—hope to answer? 




The status of exploratory factor analysis. When factor analysis is understood as a method of 
hypothesis generation, refutation or confirmation, we seemingly are caught 
in a dilemma. Neither of the following positions is satisfactory: 


(A) Exploratory factor analysis produces justified, incorrigible truths. 

(B) Exploratory factor analysis is merely one among many possible 
psychological aids to discovering hypotheses which must then be 
tested and confirmed or refuted. 


My opinion, in contrast, is that exploratory factor analysis does not produce 
hypotheses or truths; it produces data. This data—the common-factor 
structure discerned in a population of correlations—may then be used as 
evidence to confirm, refute or even suggest some hypothesis. It may, 
however, just be data which bears no relation to any hypothesis of current 
concern. Indeed, it may be useless data. How the data is used, however, is 
not a matter for factor analysis. Seen this way, we can appreciate what is 
right and what is wrong with both positions (A) and (B). 

The data produced by a factor study may be useful, but there is no 
guarantee. No doubt, it is best in most situations to keep an open mind and 
look for all different kinds of structures in populations. Factor analysis 
provides information about one kind of structure; but other kinds of struc- 
ture may turn out to be more useful. Whether the data produced by a factor 
analysis is used depends on the situation in which the data was generated: 
Is there reason to think that the common factor structure represents some 
important feature of the bit of nature under investigation? That is, Is it 
not an artifact? Is there some theory, which the relevant investigators 
are interested in, which the data could confirm or refute? Is there some 
application which such data could be used for? If none of these situations 
obtain, likely the data will be ignored; it is simply not interesting. 

However, we cannot simply discard or reject as false the results of a factor 
analysis. We can ignore the data; we can regard the data as uninformative 
about the bit of nature we are interested in. But this data is not false for it 
is not the kind of thing that can be false. The common factor structure 
exposed by a factor analysis exists in the correlation matrix. It may not 
represent any important feature of the bit of nature under investigation and 
thus may not be interesting; but it is not false, although hypotheses about 
it may well be false. In this sense, the result of a factor analysis is incorrigible.¹

Exploratory factor analysis provides incorrigible data which could be
evidence to test or suggest corrigible hypotheses. Positions (A) and (B) are
both on the right track. They both go wrong when we think of the result
of factor analysis as an hypothesis. It is not; it is data which factor analysis,
as a part of the logic of discovery, helped us to discover.


1 There are, of course, difficult problems with errors of measurement, a consequence of which
is that data may be false, in a sense. In the same spirit with which I have avoided the issues
of statistical inference and assumed that the correlations analysed by factor analysis are true
population correlations, I also assume we can treat separately issues about the inevitable
errors of measurement.


Instruments and discovery. It has been common in philosophical discussions
of discovery to regard hypotheses as paradigms of things we hope
to discover. Laudan characterises the abandonment of the logic of discovery 
in terms of the changing context of justifying hypotheses. Perhaps my most 
central claim is that factor analysis does not produce hypotheses. So, it 
might seem that factor analysis could not be part of a logic of discovery. I 
wish to close by urging that this presupposes too narrow a view about
discovery. Factor analysis in particular, and all information-transforming 
instruments in general, are tools of discovery. 

We discovered chromosomes with microscopes. At the beginning chromo- 
somes were simply dark structures seen under the microscope. Numerous 
hypotheses were suggested: Are these structures artifactual irregulari- 
ties of the process of staining? Do these structures have an important 
role in cell function? Answers to these questions establish the interest and 
importance of the discovery. Answers also refine and clarify what has been 
discovered. There is a tendency to say we did not discover something until 
we can answer these questions. On such a view, then, it is easier to say we 
discovered that chromosomes are an important part of a cell; that they divide 
when the cell divides, etc. This makes discovery, discovery of hypotheses. 

While all of this is important, describing the whole of discovery this way
short-changes the instrument with which the discovery was made in the
first place. Dark structures did not simply appear. We had to find them 
with the use of microscopes and staining techniques. Thus, one problem 
the logic of discovery should investigate is how and why information-trans- 
forming instruments can produce such discoveries. 

Instruments produce data which may or may not be interesting and 
important. Data which is interesting is usually cited as a discovery. The 
title may later be revoked, if further study shows that the data was mis- 
understood and that it is less interesting than it was originally thought to 
be. Interest in the data may be theoretically motivated. The data might be 
useful as evidence to confirm, refute or experimentally elaborate some 
theory of interest. But, interest need not be theoretically motivated: the 
data may indicate an hitherto unsuspected but striking phenomenon. No 
doubt there are many other reasons data may be interesting; elaborating 
such reasons should be one project for the study of discovery I have in 
mind. 

I do not suppose there are many revealing truisms about how to use 
instruments to make discoveries. There is no guarantee that a discovery 
will result if a given instrument is used in a given context; this is a sense in 
which there is no logic of discovery. There are some not-particularly reveal- 
ing truisms: (1) You will not discover anything with an instrument unless 
you use the instrument; (2) success can spawn further success—making one 




discovery about cells with a microscope suggests further microscopic study; 
(3) sometimes theory suggests that use of a given kind of instrument would 
be revealing—quantum theory suggests the use of yet more powerful 
accelerators. 

The main moral we might draw about instrument-prompted discoveries 
is that information must be expressed in a certain manner to be useful 
to human discoverers. Humans would have had a hard time discovering 
chromosomes without the help of microscopes to re-express the information 
in a cell in a manner useful for humans. The raison d’être of factor analysis 
and any such information-transforming instrument, is to make nature ‘show 
off’ for humans. 


Department of Philosophy, University of South Carolina, 
Columbia, SC 29208 


REFERENCES 


GOULD, S. J. [1981]: The Mismeasure of Man. First edition. New York: W. W. Norton & Co. 

HACKING, I. [1983]: Representing and Intervening. First edition. Cambridge University Press. 

HEARNSHAW, L. S. [1964]: A Short History of British Psychology. First edition. New York:
Barnes and Noble.

JANARONE, R. [1985]: Private conversation. 

JÖRESKOG, K. [1966]: ‘UMFLA—A computer program for unrestricted maximum likelihood
factor analysis’, Research Memorandum 66-20. Princeton, New Jersey: Educational Testing
Service.

JÖRESKOG, K. [1967]: ‘Some contributions to maximum likelihood factor analysis’,
Psychometrika, 32, pp. 443-82.

JÖRESKOG, K. [1969]: ‘A general approach to confirmatory maximum likelihood factor
analysis’, Psychometrika, 34, pp. 183-202. Reprinted in Jöreskog and Sörbom [1979]. Page
references are to that volume.

JÖRESKOG, K. and LAWLEY, D. N. [1968]: ‘New methods in maximum likelihood factor
analysis’, British Journal of Mathematical and Statistical Psychology, 21, pp. 85-96.

JÖRESKOG, K. and SÖRBOM, D. [1979]: Advances in Factor Analysis and Structural Equation
Models. First edition. Cambridge, Massachusetts: Abt Books.

LAUDAN, L. [1981]: Science and Hypothesis. First edition. D. Reidel.

LONG, J. S. [1983]: Confirmatory Factor Analysis. First printing. Beverly Hills, California:
Sage Publications.

MULAIK, S. [1985a]: ‘Confirmatory Factor Analysis’. In R. B. Cattell and J. Nesselroade, 
editors, Handbook of Multivariate Experimental Psychology. Second edition. In press. 
Page references are to an informally circulated pre-publication typescript. 

MULAIK, S. [1985b]: ‘Exploratory Statistics and Empiricism’, Philosophy of Science, 52, pp.
410-30. 

PORTUGAL, F. H. and COHEN, J. S. [1977]: A Century of DNA. Fourth printing, 1979.
Cambridge, Massachusetts: M.I.T. Press.

SPEARMAN, C. S. [1927]: Abilities of Man. First edition. New York: Macmillan. 

THURSTONE, L. L. [1947]: Multiple Factor Analysis. Eighth impression, 1969. University of 
Chicago Press. 

THURSTONE, L. L. [1948]: ‘Psychological implications of factor analysis’, American Psycholo- 
gist, 3, pp. 402-8. 

UNITED STATES BUREAU OF THE CENSUS [1975]: Historical Statistics of the United States, 
Colonial Times to 1970, Bicentennial Edition, Part 1. Washington, D.C. 


Brit. J. Phil. Sci. 38 (1987), 339-346 Printed in Great Britain 339 


Is Determinism a Vacuous Doctrine? 


by GEORGE N. SCHLESINGER 


1 Russell’s Thesis
2 Do Only Simple Statements Qualify As Laws?
3 Determinism and Causality
4 Determinism and Cyclicity
5 What is a Law of Nature?
6 The Correct Way of Stating Laplacean Determinism


1 RUSSELL’S THESIS


The question as to whether the world is strictly deterministic has con- 
tinually been of the highest interest to all thinking people. Some have 
insisted that future scientific developments are going to vindicate their 
belief in complete Laplacean determinism; others have argued that in the 
light of the results of contemporary physics, full determinism on a micro- 
scopic level can no longer be held to be true, while yet others have gone 
further and maintained that the universe is replete with random events on 
any level. 

As we know, however, Russell has argued that regardless of what our universe
is actually like, it is quite a trivial task to construct functions that will
describe the behaviour of all the physical systems we have observed. All we 
need to do is compile a list of the corresponding values of any two variables 
of a given system and this will automatically furnish a mathematical function,
which of course is likely to be rather complicated, containing as many
terms as the results to be accounted for, but which will be satisfied by all
our data (Russell [1963]). 
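
Russell's construction can be made concrete with a short sketch. The Python below is an illustration only and is not taken from Russell: it assumes Lagrange interpolation as one standard way of building such a function, the observations are invented, and the name `lagrange_interpolant` is hypothetical. Whatever values are listed, provided the values of the first variable are distinct, the resulting polynomial is satisfied by all of them.

```python
from fractions import Fraction


def lagrange_interpolant(points):
    """Return a function p(x) passing exactly through every (x, y) pair.

    Exact rational arithmetic is used so that 'satisfied by all our data'
    can be checked with equality rather than a tolerance.
    """
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]
    if len(set(xs)) != len(xs):
        raise ValueError("values of the first variable must be distinct")

    def p(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    # Basis factor: equals 1 at xi and 0 at every other xj.
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return p


# Hypothetical, deliberately patternless observations of two variables.
observations = [(0, 7), (1, -3), (2, 12), (3, 0), (4, 5)]
law = lagrange_interpolant(observations)

# The constructed function fits every observation, as Russell's thesis says.
assert all(law(x) == y for x, y in observations)
```

The function has as many terms as there are results to be accounted for, which is the complexity the text anticipates; nothing in the construction depends on the data exhibiting any regularity.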

Are we thus to conclude that the doctrine of determinism, which does 
not rule out any conceivable state of affairs, is vacuous and therefore that 
the hundreds of papers debating its status are pointless? Or are we justified 
in believing that the question of determinism is a substantial one, for we
may—as apparently have hundreds of people who have written on the 
subject—simply ignore Russell on the assumption that he is wrong? 


2 DO ONLY SIMPLE STATEMENTS QUALIFY AS LAWS? 


The suggestion that springs easiest to mind as to how we may best meet 
Russell’s challenge is that not all mathematical functions, but only fairly 
simple ones, qualify to represent authentic laws of nature. A universe in 




which there are regularities that are too complex to serve as practical means 
for predictions is one that contains regularities that do not qualify as genuine
laws and thus it is an indeterministic universe. 

Hempel is one philosopher to have entertained such an idea. He claims 
that a scientist who regards determinism to be a substantial doctrine will 
count as laws only statements that are 


... of a simple kind to permit of discovery and subsequent predictive application 
by human beings (in Hook [1958]). 


E. Nagel, in his classic The Structure of Science, has also maintained that a 
statement beyond a certain complexity is not fit to be considered as a law 
of nature. 

Now there may be nothing wrong in wishing, in hoping or even in 
strongly believing that nature and nature’s laws are fairly simple. Is it, 
however, reasonable to insist that they must be simple, since it is necessarily 
true that there are no complicated laws, for anything complicated can ipso 
facto not be a genuine law of nature? We might also ask: if humans were 
only half as intelligent as they actually are and thus many regularities that 
are presently treated as laws would be inaccessible to them, would these 
relatively difficult laws lose their lawful status? If so, are we then not 
committed to the view that the question whether a given set of phenomena 
are governed by law or not, essentially depends on how smart its inves- 
tigators are? 

In addition, there is little doubt that under the appropriate circumstances 
philosophers like Hempel and Nagel would themselves renounce the prin- 
ciple disqualifying unusually complex statements from serving as laws of 
nature. Suppose that in some future time with the aid of supercomputers 
we acquire the ability to handle skilfully complexities far beyond our current 
reach. Suppose also these highly advanced computers produce an 
expression C which is many thousand times more intricate than anything 
we employ today in science. We find that employing C enables us to predict
a vast variety of events and also unify a peerlessly large set of otherwise 
disparate laws. Surely no scientist nor any philosopher would advocate the 
throwing away of such advantages. Given the unprecedented usefulness 
of C and that mathematical abstruseness no longer inhibits us, all
would embrace C as an authentic representation of a fundamental law of 
nature. 


3 DETERMINISM AND CAUSALITY 


Another possibility is to try and define determinism with the aid of the 
causality principle. Most philosophers tend to agree that one of the major 
implications of strict determinism is that a particular which has not changed 
any of its properties between times t and t’ and is surrounded by precisely
identical circumstances at these times, will behave in exactly the same way 




at t and t’. Conversely of course, in case the particular in question does not 
behave identically on those two occasions, then strict determinism is false. 
A rigorous formulation of determinism along these lines has recently been 
attempted by Ingemar Nordin in his [1982]. The essential core of his 
suggestion has of course been entertained before, namely, that determinism 
amounts to the conjunction of the two classical principles (i) Every event 
has a cause, (ii) Like cause like effect. However, the term ‘like’ needs to be 
explicated and the highly controversial term ‘cause’ should, if possible, be 
replaced by something more straightforward. Nordin ends up by stating 
that two events are of the same type if, and only if they are exactly alike, 
except for where and when they occur, and in light of this definition he 
reformulates (i) and (ii): (a) For every event bᵢ there exists an event aᵢ such
that aᵢ causes bᵢ. (b) If an individual event a causes an individual event b
then whenever an event of type A occurs in the same circumstances as A, it causes
an event of type B. Here aᵢ and bᵢ are events of type A and B respectively. Nordin
contends the world to be deterministic if and only if (a) and (b) hold.
Subsequently he eliminates the term ‘cause’ by substituting an equivalent for
constant conjunction. The details are not important for our purposes. 

Nordin calls his a ‘fairly uncontroversial formulation’ and on the surface 
it does look rather innocuous. Let us for a moment assume that the 
traditional, Laplacean definition of determinism is inadequate because of 
Russell’s thesis, while this attempt to offer a definition that makes use of 
the causality principle, is successful. We may then conclude that it is not 
necessary to think of all events as strictly preordained just because they
are predictable on the basis of the laws of nature. What does ensure the 
necessitation of every occurrence is their complete interconnectedness, that 
is, the fact that there exist no loose, free floating happenings. The causality 
principle guarantees that any given type of event has an unbreakable link 
to what is its effect and thus the occurrence of the former compels the 
occurrence of the latter. In a deterministic universe all phenomena are 
tightly interwoven; each event occurs inevitably because of its link to the 
great chain of events that constitute the history of the universe. 

On a closer look, however, it becomes evident that the Russellian objec- 
tion that the doctrine is trivially satisfied and is therefore devoid of interest, 
applies to Nordin’s version no less than to the first version. For suppose, 
for instance, that an object O is in identical states at t and t’, and is irradiated
in precisely the same way at both times, yet at t it reflects 90 per cent while
at t’ it reflects only 70 per cent of the radiation impinging on its surface. It
would be too hasty to conclude that this would really present us with an
instance where the principle of causality has definitely been violated. As
long as it is not the case that the entire universe was in perfectly identical
states at t and t’, O’s different behaviour at those two points in time can always
be attributed to different causes acting on it. For if there exists but a single
atom anywhere in the universe that has moved slightly between t and t’,
then the decrease in O’s reflective index could well have been caused by it
being further away from O at t’ than at t. In short, since we do not wish to
exclude the possibility of action at a distance, then as long as in our universe 
the same total state is never duplicated we cannot be presented with a 
definite case where two different events must have had identical kinds of
causes. Nothing can therefore happen that would incontestably be a viola- 
tion of the causality principle and therefore, as Russell said, the universe 
may be claimed to be necessarily deterministic. 


4 DETERMINISM AND CYCLICITY


Seeing that what appeared to be the two major routes most likely to lead to 
a solution of our problem have been blocked, one might be attracted to 
trying a somewhat out of the way approach. We might attempt to restore 
substance to the doctrine of determinism by associating it with a cyclic 
universe. Thus we could suggest that the doctrine of determinism implies 
that if the total state of the universe was ever identical with a state that had 
obtained at any time in its past, the universe would from then onward go 
through eternally recurring cycles. For if the universe is deterministic, then 
once the total state S₀ has been followed by S₁, S₂, ... Sₙ, S₀, this is bound
to happen again and again. 
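
The inference from determinism plus a repeated total state to eternal recurrence can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not part of the argument itself: total states are represented by small integers, the 'law' `step` is an arbitrary invented function, and the names `trajectory` and `find_cycle` are hypothetical. All that matters is that the next state depends on the present state alone.

```python
def trajectory(step, start, n_steps):
    """Apply a deterministic law `step` repeatedly, recording each total state."""
    states = [start]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def find_cycle(states):
    """Return (index of first occurrence, period) for the first repeated state.

    Returns None if no state in the list occurs twice.
    """
    first_seen = {}
    for i, s in enumerate(states):
        if s in first_seen:
            return first_seen[s], i - first_seen[s]
        first_seen[s] = i
    return None


def step(s):
    # Invented toy law: the successor is a function of the current state only.
    return (3 * s + 1) % 7


states = trajectory(step, 0, 12)
start_index, period = find_cycle(states)

# Once the initial state recurs, every later state repeats with the same period.
assert (start_index, period) == (0, 6)
assert all(states[i] == states[i + period] for i in range(len(states) - period))
```

Because `step` is a function of the current state, a recurrence of S₀ cannot be followed by anything other than the original successor; a different successor would show that `step` was not a function of the state after all.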

It may well appear that this approach provides a clear definition of an 
indeterministic universe: if S₀ comes around for a second time, and instead
of being followed by S₁, as it was on the first occasion, it is now
followed by S₁*. On a closer look, however, it seems that an objection might
be raised. It could be claimed that the very fact that on the second occasion
S₁* followed the so-called S₀, shows that we did not actually have the
self-same state repeated again on this occasion. After all, the answer to the
question, what kind of elements the universe is made up of, depends on the
dispositional properties of those elements. Given that on the first occasion
they generated S₁ while on the second they produced S₁*, we have decisive
evidence of the different properties these elements exemplify and that the
whole universe just could not have been in identical states in those two 
instances. 

Let us however ignore all the problems there may be with the idea of a 
cyclic universe and let us assume that certain cyclic universes are definitely 
deterministic and others are unquestionably indeterministic. Even with this 
concession, the present approach cannot be of much help to us. We have
still been given nothing whereby we could distinguish between the various 
members of the infinitely vast class of universes in which the same state 
never occurs more than once. Surely it would be absurd to insist that 
even a universe in which everything proves to be precisely predictable 
with the aid of a small set of very simple laws was indeterminable just 
because no state occurred in it more than once. Thus it would be unreasonable
to regard all non-repeating universes, without exception, to be
indeterministic.




On the other hand, the option that all such universes are deterministic 
does not appear attractive either. If the doctrine of determinism rules out 
nothing except a specific type of repeating universe, but otherwise permits 
any degree of chaos; if it is so trivially satisfiable as to be compatible with 
such erratically volatile worlds in which there are no stable physical objects 
at all and unprecedented events keep occurring every moment, then it is a 
doctrine devoid of any real substance. 

Thus the reasonable thing to say should be that some non-repeating 
universes are deterministic while others are not, and this puts us right back
where we began.


5 WHAT IS A LAW OF NATURE?


I believe it is possible to offer a Laplacean formulation of determinism 
provided we place appropriate restrictions on what may qualify as a law 
of nature. Let us remind ourselves that independently of the issue of 
determinism the question what is a law of nature has for a very long 
time been one of the central problems of philosophy. A large number of 
philosophers have been grappling with the problem how to offer a general 
description of a law of nature and in particular how to distinguish one from 
an accidental generalisation. 

I shall mention here briefly that probably the best known suggestion is 
that when (x)(Px ⊃ Qx) is a full-fledged law of nature, and not a mere
accidental generalisation, it supports the counterfactual ‘If this particular
exemplified P, it would also exemplify Q’. Various philosophers have raised
objections by citing counterexamples. I believe, however, that there is a
much more radical objection and that even if the distinction just mentioned
holds without exception it is unfortunately of no use whatever. For suppose
we are confronted by two statements S and S’, and all we know is that one of
them does and that the other does not express a law of nature. What are we
supposed to do in order to find out which is which? Surely the principle
that law-statements do, while others do not, support their counterfactuals
can be of no help to us, since it is not the case that if we should watch S
and S’ carefully enough we shall observe one of them in the act of supporting
its counterfactual. Thus prior to our knowledge of the status of S and S’,
we need some other clue that may be recognised by one who is ignorant of
their status and which is a reliable indicator whether S or S’ is a law of
nature and hence whether S or S’ is to be assumed to be supporting its
counterfactual.

The same kind of objection applies to some other suggestions. David
Armstrong [1983], for example, recently devoted a whole book to the problem,
ending up with the suggestion that a law of nature does not merely
assert that (x)(Px ⊃ Qx), but in addition it also claims that there is a nomic
relation between the universals represented by ‘P’ and ‘Q’. Once more no
clue is provided as to how to distinguish cases where only the universal
generalisation holds from those where in addition the universals involved
are nomically related.


6 THE CORRECT WAY OF STATING LAPLACEAN DETERMINISM 


I should like to advance the thesis that a true statement does not make the
grade as a law of nature by merely giving words to a sequence of events. 
The description of a haphazard series of occurrences does not amount to a 
genuine law. Essentially a law of nature is a description of recurring regu- 
larities; it depicts the orderly repetition of patterns. It follows therefore 
that when L is a genuine law then a great number of subsets of the data we
have also form sequences that require the postulation of L.

In order to shed light on the notion of a variable-pair generating a 
sequence exhibiting the regular reappearance of a given motif, let us contrast 
the situation obtaining in the context of what we believe to be a lawfully 
correlated physical parameter pair, and the situation we face when dealing 
with unrelated parameters. Suppose, for instance, we observe the distance 
covered by a freely falling object near the surface of the earth at every tenth 
of a second, and obtain 100 results. There will of course be infinitely
many expressions which all the data will satisfy. However, Galileo’s law,
(G): s = (1/2)gt², may be said to be ‘uniquely exhibited’ by these, meaning
that of all the expressions that fit each of the results, (G) is the simplest. Or
‘uniquely exhibited’ may be taken to mean that (G) will universally be
regarded as the hypothesis best accounting for all the results. Now let us
compile, say, 25 lists, each consisting of a different combination of 10 results
picked from the complete list of 100 results. We shall find that the group of events
represented by each list also uniquely exhibits (G), that is, the simplest
equation satisfied by the respective ten results is none other than (G). Thus 
we may say that a significant feature is repeatedly displayed by each of the 
25 groups: every one of them uniquely exhibits expression (G). We may 
add, that should we obtain a new result, we shall find that our increased 
collection of 101 results will still satisfy (G). In general, if n data are found 
uniquely exhibiting a certain expression and these are lawfully correlated, 
then beyond a certain magnitude of n, we shall find n+1, n+2,... etc., 
repeatedly exhibiting uniquely the same expression. 

Consider, on the other hand, some arbitrary parameter pair x and y 
which are not lawfully varying with one another and imagine we have 
obtained 100 different values of y corresponding to those of x. Once more
there are bound to be infinitely many expressions that will be satisfied by
each one of the results; however, even the simplest of those is likely to be
considerably complex. Let (Z): y = f(x) be the least complex of all the
expressions that fit the 100 results (and thus regard (Z) to be uniquely 
exhibited by the data). Then first of all, we shall find that if we obtain an 
additional result, the 101 data will in general no longer fit (Z), and in order 
to accommodate all the results, a more complex expression will have to 




be employed. Secondly, let us compile once more 25 different lists each 
containing a different combination of results picked from the original list 
of one hundred. We are likely to find that the groups of values do not 
uniquely exhibit (Z), since for each group there exist simpler equations
than (Z) satisfied by all ten members. Consequently, the sequence of values 
obtained for x and y, unlike the one obtained for distance and time for 
freely falling objects, cannot be said to exhibit a recurrence of pattern in 
the required sense. The former repeatedly exhibits, in a unique fashion,
neither (Z) nor any other mathematical expression, simple or complex.
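
The contrast between the two kinds of parameter pair can be checked numerically under one explicit assumption that the text does not make: that 'simplest expression' means the lowest-degree polynomial passing exactly through the points. On that reading, the sketch below compiles 25 lists of 10 results from 100 free-fall observations and from 100 unrelated values. The figure g = 9.8, the seeded pseudo-random 'lawless' values, and all function names are illustrative choices, not the author's.

```python
from fractions import Fraction
import random


def newton_coefficients(points):
    """Divided-difference coefficients of the unique interpolating polynomial."""
    xs = [Fraction(x) for x, _ in points]
    coeffs = [Fraction(y) for _, y in points]
    n = len(points)
    for level in range(1, n):
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return xs, coeffs


def evaluate(xs, coeffs, x):
    """Evaluate the Newton-form polynomial at x by nested multiplication."""
    x = Fraction(x)
    result = Fraction(0)
    for xi, c in zip(reversed(xs), reversed(coeffs)):
        result = result * (x - xi) + c
    return result


def degree(coeffs):
    """Degree of the polynomial: index of the last non-zero Newton coefficient."""
    nonzero = [k for k, c in enumerate(coeffs) if c != 0]
    return max(nonzero) if nonzero else 0


def subset_report(data, n_lists=25, size=10, seed=0):
    """For each sampled list: (degree of its simplest fit, does it fit all data?)."""
    rng = random.Random(seed)
    report = []
    for _ in range(n_lists):
        subset = rng.sample(data, size)
        xs, coeffs = newton_coefficients(subset)
        fits_all = all(evaluate(xs, coeffs, x) == y for x, y in data)
        report.append((degree(coeffs), fits_all))
    return report


g = Fraction(98, 10)  # assumed value of g, in exact arithmetic
free_fall = [(Fraction(k, 10), g / 2 * Fraction(k, 10) ** 2) for k in range(1, 101)]

rng = random.Random(1)  # arbitrary seed for the unrelated parameter pair
lawless = [(Fraction(k), Fraction(rng.randint(-1000, 1000))) for k in range(1, 101)]

fall_report = subset_report(free_fall)
lawless_report = subset_report(lawless)

# Every free-fall sublist yields the same quadratic, and it fits all 100 results.
assert all(d == 2 and fits for d, fits in fall_report)
# Each lawless sublist has a simpler fit of its own that fails on the full data.
assert all(d <= 9 and not fits for d, fits in lawless_report)
```

On this reading each free-fall sublist recovers (G) itself, whereas each sublist of unrelated values is fitted by a polynomial of degree at most 9 that the remaining results do not satisfy. Other measures of simplicity would need their own test.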

Now that we have contrasted the behaviour of law-like variable pairs 
with that of law-less ones in terms of the hypotheses different sets of 
observations would be taken to support, it becomes evident that we may 
also characterise a genuine law of nature as one which is possible to establish 
by the inductive method. By saying this we do not in any way make the 
nomic status of expressions dependent on the capacities and skills of those 
who happen to be investigating nature. Our characterisation is compatible 
with the existence of laws of such immense complexity as to be placed 
forever beyond the reach of human grasp. However, as long as different 
subsets of sufficiently large sets of data uniquely display the same regularity, 
it follows, that by positing the best hypothesis concerning the rule the 
results already obtained obey, plus the assumption that the unobserved 
resembles the observed, it is in principle possible to anticipate the shape of 
things to come. Thus as long as sequences of events exhibit recurring 
patterns in the sense we have explained, then from an objective point of 
view, the law governing those events lends itself to discovery by the inductive
method (and by the inductive method only).¹

We may thus define the doctrine of universal determinism with the aid 
of a Laplacean Demon. We may say that for a Demon who had a full 
knowledge of the momentary conditions prevailing throughout the universe 
and knew all the laws of nature 


... to it nothing would be uncertain, the future as well as the past would be present 
to his eyes (Laplace [1820]). 


It is crucially important to realise that though the Demon may be thought 
of as endowed with infinite intelligence we need not ascribe to it prophetic 
vision or any supernatural powers. Thus its knowledge of all the laws of 
nature is not to be thought of as having been acquired by some mysterious 
revelation or through extra-sensory perception, but by the rational methods 
of science. The Demon is supposed to have made a sufficient number 
of observations to have discerned the patterns exhibited by the various 
phenomena and thus be able to infer inductively the laws operating behind 


1 On the other hand, for instance, a generalisation like that discussed by J. L. Mackie, 
‘Everyone in this room speaks English’ is, in spite of its exceeding simplicity, a mere 
accidental generalisation. It amounts to no law of nature, since it is not established by 
induction. Its truth may be fully determined by counting all its instances. 


Brit. J. Phil. Sci. 38 (1987), 347-380 Printed in Great Britain 347


On the Alleged Equivalence Between Newtonian and Relativistic Cosmology


by PIERRE KERSZBERG


Among the many controversial contributions of E. A. Milne to cosmology, the only 
one which is taken seriously today (to the extent that it has been absorbed as a 
premise in most scientific approaches to the problem of the universe as a totality) 
is his early suggestion that a formal equivalence may be made between Newtonian 
and Relativistic cosmology. My own paper suggests that, over and above any logical 
validity in the alleged equivalence, the actual way in which it has subtly insinuated 
itself into nearly all contemporary speculation reveals as few things could some of the 
epistemological foundations that underlie the current assumptions in cosmological 
science. 


1 Milne’s Neo-Newtonian Cosmology
2 Einstein’s Three Arguments against Newtonian Cosmology
3 Weyl’s First Grapplings with Cosmology
4 Weyl’s Principle and the Revival of Newtonian Cosmology
5 The Milne-McCrea Contribution of 1934
6 Layzer’s Criticism and McCrea’s Response
7 More Recent Developments
8 The Mathematics Versus the Physics of the Equivalence
9 Conclusion


1 MILNE’S NEO-NEWTONIAN COSMOLOGY


One of the most striking paradoxes of twentieth-century cosmology was 
articulated in 1934, when both E. A. Milne and W. H. McCrea showed 
that the relativistic laws controlling the universe could be derived in the 
simplest possible way from no less a source than Newton’s theory. Milne, 
in particular, was keen to demonstrate that ‘an analyst of Newton’s 
period . . . would have secured all the results yet capable of observational 
test’ ([1934], pp. 71-2). Such an analyst would have been led to predict, on 
theoretical grounds alone, a velocity-distance proportionality very similar 
to Hubble’s law and indeed compatible with the entire picture of a non- 
static universe (whether expanding or contracting). Even though Milne’s 
remark was purely polemical, its validity is widely accepted today and we 
can speak of such a reconstruction in terms of a ‘neo-Newtonian’ model of 
the universe (the expression is that of J. North [1965], pp. 180-5). This 
result seems to have been endorsed unanimously by the scientific 
community. For instance, Sciama ([1973], p. 114) puts forward what he 
calls the Milne-McCrea theorem, which allows us to see the scale factor R(t)


348 Pierre Kerszberg 


as satisfying the same equation in both the Newtonian and the relativistic 
theories, at least in the case where uniform models with negligible pressure 
are being contemplated. Given these conditions, the history of the universe 
that may be predicted by the behaviour of the scale factor as a function of 
time is identical in both theories. Most textbooks on cosmology begin the 
topic of cosmic dynamics with an examination of pseudo-Newtonian 
results, the assumption being that ‘since we understand the Newtonian
equations and models more intuitively, they help us understand the general 
relativity models and equations, which arise in a much more abstract way’ 
(Rindler [1977], p. 223). 

The equivalence has been developed in the following way. Milne’s pivotal 
idea is that a smoothed out universe can be represented as an expanding 
cosmic ball. This is a ball of matter expanding in empty Euclidean space 
([1934], pp. 67 ff.); the density of this ball is uniform and consists of
particles in a state of free fall within the gravitational field produced by the 
ball. Let M(r) be the mass contained in the sphere of radius r, and v the 
escape velocity from that mass. The minimum velocity that will enable a 
particle to reach infinity is the parabolic velocity of escape: 


$$\tfrac{1}{2}v^2 = \frac{GM(r)}{r} \tag{1}$$


where G is the constant of gravitation. As Milne proceeded to show in his 
joint paper with McCrea, it is clear that a velocity greater than the parabolic 
velocity of escape defines hyperbolic paths, while a velocity less than that 
of the parabolic velocity defines elliptical paths. The latter instance is the 
only one that predicts a collapse of the cosmic ball. Now, equation (1) also 
represents the fact that the total energy of the system, kinetic ($v^2/2$) and
gravitational ($-GM/r$), remains constant if an appropriate constant is
added to the right-hand side. The value of this postulated constant will 
now determine the behaviour of the cosmic ball: a positive or negative value 
means that the particles will follow hyperbolic or elliptical paths, while a 
zero value will result in a parabolic case. By recourse to the Hubble law of 
velocity-distance proportionality and the scale factor R, Milne was able to 
show that the most general situation is defined by 


$$\dot{R}^2 = \frac{8\pi G\rho R^2}{3} - k \tag{2}$$


where $\rho$ is the density and k the decisive constant. This is nothing other
than the relativistic Friedmann equation, which was established as early as
1922 by Friedmann and subsequently rediscovered in the early 1930s, by
which time it was apparent that a non-static metric provided a solution to 
the original Einstein—de Sitter controversy (see Friedmann [1922], as well 
as Hetherington [1973] and Torretti [1983], p. 284 for some conjectures on 
the reason for the delayed responses to early suggestions of a dynamic 
universe). When the value of R is appropriately adjusted, it can be shown 
that the values −1, 0, +1 for k correspond to the hyperbolic, parabolic and
elliptical cases. Of course, the interpretation of k differs in each theory.


Equivalence Between Newtonian and Relativistic Cosmology 349 


While the Newtonian theory interprets it in terms of the total energy of the 
particles, the relativistic models take it as a specification of the curvature of 
space. Nevertheless, as far as the prediction of the overall history of the 
universe is concerned, the equivalence seems to be total. 
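
The step from the energy relation (1) to equation (2) can be sketched as follows. This is the standard textbook route, not a transcription of Milne's own derivation. The comoving label x, the energy constant E and the substitution r = R(t)x are notation assumed here for illustration only.

```latex
% Energy of a particle on the shell of radius r. E is an assumed label
% for the constant added to the right-hand side of equation (1).
\[
  \tfrac{1}{2}\dot{r}^{2} - \frac{GM(r)}{r} = E .
\]
% Uniform expansion: r = R(t)\,x, with x a fixed label for the shell,
% and M(r) = \tfrac{4}{3}\pi\rho r^{3} for a ball of uniform density \rho.
\[
  \tfrac{1}{2}x^{2}\dot{R}^{2} - \tfrac{4}{3}\pi G\rho R^{2}x^{2} = E
  \quad\Longrightarrow\quad
  \dot{R}^{2} = \frac{8\pi G\rho R^{2}}{3} - k ,
  \qquad k \equiv -\frac{2E}{x^{2}} .
\]
% Sign correspondence between the energy constant and k:
%   E > 0  (hyperbolic)  <=>  k < 0
%   E = 0  (parabolic)   <=>  k = 0
%   E < 0  (elliptical)  <=>  k > 0
```

For k to be a single constant for the whole ball, E must scale as x² from shell to shell, which holds when every shell follows the same velocity-distance proportionality. A rescaling of R then brings k to one of the values −1, 0, +1.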

The Hubble law appears to be the key to the equivalence. Milne saw this 
very clearly, since he managed, in his first book (which followed soon after 
the publication of the equivalence paper), to deduce the law from purely 
kinematic arguments ([1935], pp. 79-80). His demonstration is most 
remarkable and it may be reconstructed in the following way (see further 
details in North [1965], p. 160). Two viewpoints O and O’ necessarily lead 
to the same picture in a uniform universe. Imagine some event or object 
vectorially related to O and O’ by r and r’. Let a be the vector from O to O’.
We have r = a+r’. If v’ is the velocity seen at O’, then


$$v'(r') = v'(r - a) = v(r) - v(a).$$


Because of the uniformity, the function v’ must depend on its argument r’
precisely as the function v depends upon its argument r. It follows that


$$v(r') = v(r - a) = v(r) - v(a)$$

for all r and a. Thus the function v must be a linear vector function of its
argument. Further, isotropy requires the multiplier of r to be a scalar, let
us say H. This scalar may, of course, be dependent upon the epoch of the
universe, the time t. Finally, we have


$$v(r) = H(t)\,r. \tag{3}$$
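
The passage from the uniformity condition to the linear law can be spelled out in a few lines. This is a sketch under two stated assumptions that the text leaves implicit: that v is a continuous function of position, and that isotropy means invariance under all rotations. The matrix A(t) is a symbol introduced here for illustration.

```latex
% Functional equation obtained from uniformity, valid for all r and a:
\[
  v(r - a) = v(r) - v(a).
\]
% Setting r = a gives v(0) = 0; setting r = 0 then gives v(-a) = -v(a).
% Replacing a by -a yields additivity:
\[
  v(r + a) = v(r) + v(a).
\]
% Assuming v is continuous (an added regularity assumption), additivity
% implies linearity: v(r) = A(t)\,r for some 3x3 matrix A(t).
% Isotropy: A(t) must commute with every rotation, hence A(t) = H(t)\,I, so
\[
  v(r) = H(t)\,r .
\]
```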


This is the Hubble law, which Hubble discovered on observational grounds 
in 1929 (see Hubble [1929]). Soon after, Robertson and Walker established 
the form of the famous Robertson-Walker metric, which indicates the 
form of all elementary intervals which are compatible with the Friedmann 
equation and the assumptions of homogeneity and isotropy. This is: 


$$ds^2 = c^2\,dt^2 - \frac{R^2(t)\,dr^2}{\left(1 + kr^2/4\right)^2} \tag{4}$$


Note that the theory of gravitation is of relevance only in the determination
of R(t); the geometry, k, is fixed quite independently. The problem is to
find a dynamic theory of the function R(t). In relativistic cosmology, the
equations of Einstein’s gravitational field theory are applied here, but pre-
dictions as to the large-scale correlations of the universe are identical with
those of the scalar H(t) in equation (3) when only the concept of Newtonian
‘absolute’ time and Euclidean geometry for vector addition are brought
into play.

So much for the theory itself. When the equivalence is exploited as part
of an introduction to the study of relativistic cosmology (something which
is done in almost every contemporary textbook), it becomes apparent that
such an approach is not particularly penetrating. Yet, this method disguises
its own ultimate significance as long as the very possibility of the alleged 
equivalence fails to be recognised as a problem which, in fact, transcends 
the power of scientific analysis. To contemplate this quandary in terms of 
conventionalism can only result in uncritical acquiescence. In view of the 
extremely successful application of the equivalence theme, one could argue 
that, precisely because the alternative theoretical descriptions are equally 
correct, they are therefore not really incompatible. This would leave us, 
admittedly, with the problem of how to single out the distinguishable but 
equally valid ‘definitions’ operative in each procedure. This conventionalist 
way out is tempting enough, since Milne himself believed that the funda- 
mental principles of cosmology ought to be referred to the actual aspects 
and motions of the universe rather than to the ‘laws of nature’ (see Bondi 
[1960], p. 124). From his point of view, this forms the basis of a theory
quite separate from both the relativistic and the Newtonian, which has come
to be known as kinematic relativity. In other words, Milne’s construction
of the equivalence was aimed at neutralising both rival theories. Theoretical
underdetermination is the very essence of his reasoning, in the sense
that incompatible (but local) theories will never fail to be in agreement à
propos of large-scale observations. For Milne, all that remained was some
demonstration that kinematic relativity was the sole theory which could 
fully express the necessary conventionalist position involved somewhere 
along the line in any theory of the whole universe ([1935], pp. 8 ff.). 

Now, it is all the more significant that recently, the problem of equiva- 
lence has received attention from someone who is a cosmologist rather 
than an historian or a philosopher. Indeed, Edward Harrison, in his brilliant 
textbook published in 1981 (see p. 283), expresses his puzzlement at this: 
‘In all its applications, Newtonian theory is only approximately true, and 
yet in this most unlikely of all instances it yields the correct answer.’ In 
short, the real problem is: Why does some form of (mildly) modified 
cosmology yield the right answer? Or, to be more precise: Why did we have 
to wait for general relativity before a consistent Newtonian cosmology could 
emerge? The problem had already been given a decisive impetus when, 
some twenty years ago, Schücking managed ‘to discuss the question why
such a simple and beautiful theory as Newtonian cosmology was not already 
formulated centuries ago’ ([1967a], p. 270). Yet, both Harrison and Schück- 
ing turn to a scientific solution of the problem, which tends to obscure the 
philosophical crux of the matter. 

Problems and even outright contradictions occur in many different ways 
within the framework of strict Newtonianism. To be sure, the universal
scope of the law of gravitation is far from being associated with any definite, 
concrete model of the universe. (For historical details on how Newton 
himself faced the problem, see Kerszberg [1986].) Let us suppose that the 
universe is a uniform, infinite distribution of matter in Euclidean space. 
Now, in a celebrated argument, Bentley asks Newton if such a situation 
might not be able to account for the static equilibrium of all the stars, since
infinite attraction exerted by all masses on one side of some central particle 
would be compensated for by an infinite attraction countering it on the 
other side. Newton’s reply illuminates the kind of insuperable problems 
involved in any Newtonian physics that hopes to deal with the whole 
universe: ‘. . . if a Body stood in Equilibrio between any two equal and 
contrary attracting infinite Forces; and if to either of these Forces you add 
any new finite attracting Force, that new Force, how little soever, will 
destroy their Equilibrium, and put the Body into the same Motion into 
which it would put it were those contrary equal Forces but finite, or even 
none at all; so that in this Case the two equal Infinites by the Addition of 
a Finite to either of them, become unequal in our ways of Reckoning’ 
(Cohen [1958], pp. 295-6). These ‘ways of Reckoning’, according to 
Newton, are the bases of mathematical reasoning. If the infinites are equal, 
in accordance with Bentley’s physical conception, then no motion at all 
is possible in the universe, even though equilibrium would be realised 
everywhere by virtue of Nature’s own forces. But if the infinites are unequal, 
in accordance with mathematics, then there is motion but no equilibrium, 
i.e., gravitational collapse. In either event, we face the very same impossi- 
bility: the occurrence of any particular kind of finite force in the universe 
may be understood irrespective of its inclusion in the whole universe. 
Newton would most probably have justified in that way the legitimacy of 
such thought experiments as the motion of two spheres in an otherwise 
empty universe, but he abandons cosmology in the highly precarious 
position of appearing to explain absolutely nothing at all. 

The twentieth-century revival of Newtonian cosmology does indeed 
bring to the fore a concept which Milne himself sees as purely relativistic. 
This is the ‘cosmological principle’, which is accepted today by virtually all 
cosmologists, whatever their theoretical persuasion. Milne is often credited 
with the enunciation of this principle, even though he repeatedly accords 
the honour to Einstein’s celebrated cosmological paper of 1917 (see for 
instance Milne [1935], p. 24 and pp. 60 ff.). Yet, the formulation is quite 
absent from Einstein’s pioneering work; the term seems to have been first 
introduced as a tool for systematisation by Robertson in his review paper 
on the state of cosmology in 1933 (pp. 62-5). Einstein spoke of uniformity
only to conform with what he took to be the facts as presented to us by the 
astronomers. There is no doubt that in Milne’s eyes the principle refers to 
a theoretical, yet covert aspect of Einstein’s thought. Apart from any dispute 
over terminology, this situation reveals much of the significance of the 
equivalence theme. Thus, an at times bewildering amount of confusion 
derives from the very positing of the initial problem. For instance, when 
Harrison raises the question of why Newtonian cosmology is apt to give 
the right answer to the problem of the universe, he puts forward the 
apparently sound objection that ‘matter moves through space in the New- 
tonian picture, and therefore at large distances the expansion velocity must 
exceed the velocity of light’, but ‘motion through space faster than light
flatly contradicts relativity theory’ (p. 287). Now this does contradict the 
Newtonian theory as well, and Harrison omits to point this out. On another 
occasion, Harrison brings out the true consistency of relativistic cosmology 
by distinguishing recession velocities from ordinary velocities (p. 240). 
While the latter amount to a classical example of Doppler redshifts, the 
former apply to bodies comoving in expanding space. An expansion redshift 
may well exceed the velocity of light, for otherwise the law of velocity- 
distance proportionality would terminate abruptly at some conceivable edge 
of the universe. In his account, Harrison provides us with a profound 
insight into the real problem of cosmology in the 1920s, since the confusion 
between Doppler and expansion redshifts (i.e., between motion in space
and motion of space) remains undetectable as long as the measured redshifts
are small, as they indeed were at that time. But when he reverts to the
acknowledged equivalence between Newtonian and relativistic cosmologies,
Harrison fails to take proper account of these faster-than-light velocities
which are as much a challenge to relativity as they are to Newton’s theory. 
One way of overcoming the centre/edge paradox in Newtonian cosmology 
would be to articulate it in terms of an indefinitely large cloud of particles, 
but the difficulty then is that ‘this edge expanding faster than the velocity 
of light leads to contradiction and not to reconciliation between the New- 
tonian and the relativity pictures of the universe’ (p. 287). It is significant 
that Harrison has to fall back on rather feeble expressions about the 
universe, such as ‘picture of the world’. Any stringent distinction between 
the physical concepts and the corresponding models of the universe has 
become hopelessly blurred. 

This interplay between concept and world-model is entirely charac- 
teristic of present-day attitudes to equivalence, as exemplified in the work 
of Heckmann and Schücking (see their detailed exposition of Newtonian
cosmology in [1959], pp. 491-0, as well as some pertinent remarks in Sklar
[1976], pp. 5-6). In order to demonstrate the consistency of Newtonian
cosmology, such writers introduce concepts borrowed quite explicitly from 
relativity theories. So, armed with the notion of local inertial frames, it is 
easy enough to ‘trade-off’ non-inertiality and gravitational fields so as to 
circumvent the traditional problem of boundary conditions. As Sklar says 
with some emphasis, ‘a convenient comparison of related Newtonian and 
general relativistic models is easily made in its terms’ (p. 6). A good deal 
too easily, one is tempted to say. In the following pages, I shall attempt to 
show that the entire history of the problem of the equivalence, from Milne’s 
early paper to the present-day nearly unanimous agreement about its sup- 
posed logical soundness, can tell us a good deal about the conceptual 
foundations of this equivalence. And we shall see that these foundations, 
in turn, reflect a more profound problem. The subject is of considerable 
philosophical interest, inasmuch as it presents what, prima facie, would 
appear to be one of the very few instances where underdetermination of 
physical theories by physical facts has actually made its presence felt in real
life. An examination of the kind of response which has been made to that 
problem may uncover the epistemological roots of the whole twentieth- 
century attitude towards any conceivable science of the universe. 

First, and last, we should bear in mind Newton’s own attitude, as he 
expressed it to Bentley: the very fact that our world is obviously not uniform 
is the actual cornerstone of any science. What we find is the variety of 
planets, comets, stars; a common pattern of orbits for the planets around 
the sun, as opposed to the erratic motions of the comets themselves; one 
luminous centre for a multiplicity of opaque bodies: all this clearly bears 
witness to God’s handiwork (see Cohen [1958], pp. 280-2, plus the comments
of Westfall [1973], pp. 196-7). This kind of non-uniformity has
been accounted for successfully enough by the application of the law of 
gravitation to the solar system. It is the very degree of success which brings 
the methodological aspect of Newton’s science in line with its supposed 
ontological dimension. Yet, when Newton comes to consider the extent of 
the whole starry universe as finite, in response to the mathematical diffi- 
culties of the infinite, the result should be, by rights, a central mass, blurred 
and undifferentiated. Newton refuses a solution after the planetary model 
in which centrifugal forces would deflect the fall of stars from their straight 
paths, on the grounds that the kind of diversity exhibited in the solar system 
simply could not occur in this sphere (Cohen [1958], p. 306). It is certainly 
with this particular problem in mind that Newton reverted to the famous 
solution adumbrated in the General Scholium that ends the Principia: ‘if
the fixed stars are the centres of other like systems, these must be all subject 
to the dominion of One .. . and lest the systems of fixed stars should by 
their gravity, fall on each other, he hath placed those systems at immense 
distances from one another’ ([1934], p. 550). Newton seems to envisage the 
force of gravitation as being deliberately limited by the will of God. In order 
to get round what is in effect precisely the same problem, it is only a short 
step to propose the limited descriptions of the large-scale properties of the 
phenomenon of gravitation, which now hold sway in contemporary versions 
of Newtonian cosmology. In fact, straddling these two extremes, the pro- 
ponents of relativity and its early critics have both perceived the problem 
of a consistent Newtonian cosmology very much in terms of a postulated 
limit inherent in any scientific picture of the world. Of course, these limits 
have to be clarified if we want to comprehend the genuinely modern way 
of transcending them. 


2 EINSTEIN’S THREE ARGUMENTS AGAINST NEWTONIAN 
COSMOLOGY 


In the first part of his celebrated cosmological memoir of 1917 (pp. 177- 
9), Einstein puts forward three separate arguments against the possibility 
of a Newtonian cosmology. Throughout the nineteenth century, the failure 
of Newton’s physics to meet the requirements of so all-encompassing a
problem as the entire universe was perceived in terms of an imperfect fit 
between a homogeneous, uniform model and the laws of the physical world, 
whether optical or mechanical. Thus, both Olbers’ speculations on the
sky being ablaze with infinite light and Neumann’s and Seeliger’s ad hoc
modifications of the law of gravitation were dominated by the idea of a
pre-established infinity and homogeneity. The tension between the idea of
infinity and the physical laws was investigated in such a way as to restore 
what had seemed to be jeopardised by the extension of purely local 
considerations. 

One of the profound innovations in Einstein’s memoir is the critique of 
just this traditional attitude. Einstein clears his mind of what had become 
an extremely common prejudice. He does not set out to vindicate a par- 
ticular model but simply explores the actual consequences implicit in the 
large-scale predictions of the theory. There had been various attempts to 
do this sort of thing before, like Charlier’s early speculations on a hierarchical
model ([1908]), but always with the aim of justifying an infinite
universe. A possible source for Einstein’s radically original attitude may 
well be the mutating image of contemporary astronomy. By the turn of 
the century, the old eighteenth-century theory of an island universe was 
undergoing a major revival, in response to the problems raised by the new 
techniques of distance measurement and the growing controversy over the 
existence of extra-galactic nebulae (see R. W. Smith [1982]). But whatever 
we make of this, the belief in extra-galactic nebulae seems merely to change 
the units of the universe; the essential point to grasp is that the force of 
Einstein’s arguments derives from theoretical interests. 

The first argument deals with a spherically symmetrical gravitational 
field around some point of mass. The gravitational force of a homogeneous 
sphere of matter acts as if the whole force were concentrated at its centre. 
The difference in potential between any exterior point to the sphere and 
this centre tends to be independent of the distance, i.e., it tends to be
constant, only when the density of successive layers decreases at a more
rapid rate than $1/r^2$. What we arrive at, then, is a distance beyond which
the exterior point is no longer exterior: it is indistinguishable from the
original sphere. In this sense, Einstein claims, one can speak of the universe
as a physically self-consistent system. Mathematically, this may be expressed
by a condition where the potential (which varies as $1/r$) shall tend
to a fixed value at infinity. In other words, the Newtonian theory predicts a 
non-uniform distribution of matter. Once a point can be found, where the 
gravitational field that surrounds it is isotropic, then that point becomes 
unique. The universe is therefore affected by a disjunction between space 
and matter: all points inasmuch as they are centres are equivalent with 
respect to space, but not with respect to the distribution of matter. Now, 
it could be argued that this disjunction is not really such a problem, at least 
in relation to boundary conditions. A constant, zero potential at infinity is 
the convenient device used in everyday physics, which enables the physicist
to ignore the incalculable influence of very remote regions on a given local 
system. So far as a local system is considered simply as a local system, the 
working assumption is that the rest of the universe does not exist: it is 
literally given the value of ‘nothing’. Now, the image of the island universe 
seems to be a natural corollary of such a view: if, for all practical purposes, 
there is absolutely no matter to worry about in the infinite, then the 
universe is directly comparable to the largest ‘local’ system that may be 
imagined. 
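
The condition that the density must fall off faster than 1/r² can be checked with a short calculation. The following is a sketch only: the power-law fall-off with exponent n is an illustrative assumption, not Einstein's own parametrisation, and the density is assumed to remain bounded near the centre.

```latex
% Newtonian potential of a spherically symmetric density \rho(s):
\[
  \Phi(r) = -\frac{GM(r)}{r} - 4\pi G\int_{r}^{\infty}\rho(s)\,s\,ds ,
  \qquad
  M(r) = 4\pi\int_{0}^{r}\rho(s)\,s^{2}\,ds .
\]
% Assumed illustrative fall-off at large s:  \rho(s) \propto s^{-n}.
% The outer integral behaves as \int^{\infty} s^{1-n}\,ds, which is finite
% only for n > 2. For n > 2 the inner term M(r)/r also tends to zero.
\[
  n > 2 \;\Longrightarrow\; \lim_{r\to\infty}\Phi(r) = 0
  \quad\text{and}\quad \Phi(\infty) - \Phi(0)\ \text{is finite}.
\]
% For n \le 2 the outer integral diverges, and no constant limiting
% value of the potential exists at infinity.
```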

In his two succeeding arguments, Einstein manages to highlight the 
difficulties inherent in that apparently comfortable collusion between the 
methods of mathematical physics and the concrete figure of the universe. 
The snag comes in the form of applying another kind of physical law: How 
is it that the stellar universe, with the relative velocities of the stars, so small 
compared to the velocity of light, manages to keep itself going instead of 
scattering itself into infinite space? The radiation emitted by the peripheral 
stars leaving the Newtonian system, becomes ‘ineffective and lost in the 
infinite’ and Einstein fears that even entire heavenly bodies might end up 
the same way. A finite kinetic energy is certainly sufficient to overcome the 
Newtonian forces of attraction. According to Einstein the reality of that 
process can be established by the logic of statistical mechanics: ‘This case 
must occur from time to time, as long as the total energy of the stellar 
system—transferred to one single star—is great enough to send that star on 
its journey to infinity, whence it can never return.’ If the density decreases 
at a sufficient rate with the increasing distance from the centre, then the 
difference of potential between the centre and the infinite is, by construc- 
tion, finite. Now, statistical mechanics shows that the relation between 
stellar populations at any two points A and B is 


$$\frac{\rho(A)}{\rho(B)} = e^{-(E_A - E_B)/kT}$$


where E stands for kinetic energy, k is Boltzmann’s constant and T is the 
temperature. If the potential at A remains finite while A tends to infinity, 
and if, furthermore, it is supposed that kinetic energy is everywhere finite, 
it follows that the population at infinity is not zero. Stars are apt to leave 
the supposedly finite system, quite apart from the radiations emitted by 
those bodies. The contradiction Einstein has in mind works like this: a 
constant value for the potential at infinity implies, according to physics, a 
finite universe but, again on physical grounds, this universe cannot remain 
finite in any strict sense. So the largest system is never large enough. 

At this point, Einstein’s conclusion seems to dramatise, with a striking 
singularity, the problem that lies at the heart of any cosmological theory. 
Let us take the uniform model, with its homogeneity and infinitude first. 
Starting from a certain point which we nominate the centre, we draw 
concentric spheres of matter. Any particle that is located at a certain distance 
from the centre will be subject to contradictory influences. Any matter 
outside the sphere whose radius is the distance between the centre and the
particle will not act upon the particle, only the matter inside it can act upon 
it, as though all matter were concentrated at its centre. Now, if we draw 
the concentric spheres around the particle which we have taken as the 
centre, no matter should act upon it since the state it is in may be thought 
of as analogous to an inertial state. This puzzle led J. L. Synge, following 
on from an early suggestion of Maxwell’s (see Synge [1937], pp. 94-5 and 
Maxwell [1952], p. 85), to state the following paradox: there are postulated 
fields of gravitation incapable of detection by any observer, since it is always 
possible for a given observer in ‘free fall’ to consider himself inertial, 
provided he takes it upon himself to describe the world in terms of a 
gravitational field which is everywhere null, instead of the non-null field 
which ‘really’ exists. All we can say of any particle in free fall which is taken 
as an inertial centre, is that matter which acts or which fails to act upon it 
is not ‘physically’ the same in both situations—it is only ‘ideally’ the same, 
which amounts to transferring the ideality of inertia to the so far only 
hypothetical ‘reality’ of gravitation. And in terms of the cosmological ques- 
tion, this means that the uniform model entails difficulties of local scale, it 
being assumed that the boundary is in any case merely ‘ideal’. From this point
of view, Einstein’s argument is far-reaching enough, since it shows that the 
non-uniform model, with its ‘physical’ picture of the boundary, is simply 
the inversion of the original problem: the paradoxes are now occurring on 
a cosmic scale, because even the physics of the boundary cannot stay in 
place. 

Yet, the validity of Einstein’s arguments remains questionable. Indeed, 
Einstein speaks of the infinite as if it were some kind of place within the 
reach of radiation and the stars. Commentators have been struck by the 
highly dubious cogency of the argument (see North [1965], pp. 70-1). As 
a matter of fact, the idea of some form of depopulation of the star system 
is not borne out by the kind of statistics he marshalls in his arguments 
(useful comments are given by Pauli [1958], p. 162). A fully consistent 
application of the very same principles should have led him to predict a 
statistically equal number of stars coming back from the infinite. But the 
significance of Einstein’s argument is its general thrust rather than any 
disputed validity of its detail. In essence, the idea is that no definite peri- 
phery can be assigned to the unique centre of the island universe, except 
by equating the infinite with a specific location that remains within reach. 
While the disjunction between space and matter caused little difficulty in 
the first argument, it now becomes cumbersome: the unique point conceived 
as centre of the mass distribution is not related to a definite periphery, 
which means that the centre itself might lack definition as well, even though 
the island-like universe still necessitates a centre. Einstein proceeds to 
annihilate one possible way out of this quandary. One suggestive solution 
runs like this: ‘We might try to avoid this peculiar difficulty by assuming 
a very high value for the limiting potential at infinity.’ All bodies move 
from higher to lower potentials; even though the potential decreases from
the centre to the periphery, the boundary potential could be so high as to 
hinder celestial bodies from crossing it—in short we could posit a definite 
periphery. Alternatively, we could see the potential as increasing from the 
centre to the periphery until it reached a particular limiting value and as a 
result generated a movement towards gravitational collapse. This latter 
view is never for a moment entertained by Einstein. He pre-empts the 
entire line of reasoning by invoking facts presented to astronomers: the 
general weakness of stellar velocities imposes an overall distribution of 
potentials which must not be drastically different from those prevailing in 
the immediate vicinity of the earth. The logical slide is all the more amazing 
for being so inconsistent: Einstein, who had begun by demarcating a type 
of argument which is grounded in the predictions of theory, finds it necessary,
in order to oppose one of the possible predictions, to have recourse to
nothing better than an accidental fact. Of course, he does have something
in mind which acts, however latently, as a quite decisive justification: the
space/matter disjunction cannot be refined out of existence by some sort of 
break between local and putatively cosmic conditions. 

This becomes quite clear in the third and final argument, where theor- 
etical requirements are no longer applied to the behaviour of single bodies 
within the universe. Einstein finds that the application of statistical pro- 
cedures to the whole of the island universe is devastating, because that 
universe is thereby deprived of existence: ‘If we apply Boltzmann’s law of 
distribution for gas molecules to the stars, by comparing the stellar system 
with a gas in thermal equilibrium, we find that the Newtonian stellar system 
cannot exist at all. For there is a finite ratio of densities corresponding to 
the finite difference of potential between the centre and spatial infinity. A 
vanishing of the density at infinity thus implies a vanishing of the density 
at the centre.’ This argument takes no account of the fact that there are 
only slight differences of potential in different parts of the universe. Einstein 
is quite concerned about the very idea of a finite difference between the 
infinite and any centre. If any difference of potential between $r_\infty$ (r at infinite
distance) and $r_0$ (the centre) leads to a finite quotient of densities, then


$$\frac{\rho(r_0)}{\rho(r_\infty)} = e^{[\Phi(r_\infty) - \Phi(r_0)]/kT}$$


where $\Phi$ is the potential. If $\rho(r_0)/\rho(r_\infty)$ is a finite number, a zero density at
infinity implies a zero density at the centre. It should be emphasised that 
a potential is defined by a quotient of the form 1/r and a constant of 
integration. A sufficiently high value for the constant would have the effect 
of invalidating the whole statistical procedure, because the density at the 
centre would then be equal to the density at infinity. 

Einstein either could not see the difficulty or he pretended not to see it. 
It is true that the last argument is meant to round off the two preceding 
arguments. Its effect is to demonstrate that the presence of matter does not 
enable us to distinguish between the periphery and the centre of the island 


358 Pierre Kerszberg 


universe. In fact, the identity of centre and periphery is the distinguishing
feature of an infinite universe. However alien to infinite and uniform homo- 
geneity the predictions of Newtonian physics may be, Einstein wants to be 
able to provide a demonstration that, in addition, these predictions are no 
better as physics either. Both Newton and Einstein are actually in agreement 
in their belief that the occurrence of the slightest force is indifferent to the 
whole Newtonian universe, whether that is thought of as uniform or non- 
uniform. Of course, a Machian orientation cannot accommodate itself to 
such a conclusion. 

The introduction of the cosmological constant λ, Einstein reminds us,
does remove the difficulty. Although Einstein makes no specific citation he 
doubtless has in mind the work of Seeliger [1895] and Neumann [1896] 
(Einstein [1957], p. 105, does refer to Seeliger in an earlier work). The 
magical potency of this term is that any universe so constituted has no 
centre with respect to the gravitational field. To put things more precisely, 
‘the solution then corresponds to an infinite extension of central space, 
filled uniformly with matter’ ([1917], p. 179). It is striking to realise that 
neither Einstein nor Seeliger speculates on the possible value of λ. The
term is taken almost as a matter-of-course justification for static equilibrium.
Similarly, when a transformation of that kind is performed in the field 
equations of general relativity, Einstein assents to it only in as much as it 
makes possible such an equilibrium. It seems clear that static equilibrium, 
apart from being in keeping with the weak stellar velocities, is concomitant 
with the theoretical quest for an ‘extension’ of the centre. Thus, even 
though the finite, unbounded system provides an answer to the impossible 
periphery in its Newtonian equivalent, the demand for static equilibrium 
forms the basis of an analogy between the two. God’s providential in- 
genuity, Newton thought, had placed the stars at immense distances from 
one another. In actual fact, the cosmological constant does the same job, to 
such an extent that Einstein’s model is equally indebted to it. 
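
The way the constant removes the centre can be made explicit. What follows is only a sketch in modern notation, assuming the form of the modified Poisson equation that Einstein gives in his 1917 paper; the symbols $\kappa$ (gravitational constant) and $\rho_0$ (a uniform density) are notational choices introduced here, not taken from the text above.

```latex
% Poisson's equation, and the same equation with a cosmological term added:
\nabla^{2}\phi = 4\pi\kappa\rho
\qquad\longrightarrow\qquad
\nabla^{2}\phi - \lambda\phi = 4\pi\kappa\rho .

% For a uniform density \rho_0 the modified equation admits a constant solution:
\phi = -\frac{4\pi\kappa}{\lambda}\,\rho_{0},
\qquad
\nabla\phi = 0 .
```

Because the gradient of the potential vanishes everywhere, no point of the uniform distribution is singled out as a centre of the gravitational field, which is the sense of the 'infinite extension of central space' quoted above.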


3 WEYL’S FIRST GRAPPLINGS WITH COSMOLOGY 


In his early critical account of Einstein’s theory, Weyl makes a fresh start 
on the problem of Newtonian cosmology ([1921], pp. 277 ff.). On episte- 
mological grounds derived from an analysis of the foundations of general 
relativity, he believes that the physical laws as they are expressed in differ- 
ential equations remain complete in themselves and have no need for further 
supplementing by such limitations as special values for the constants of 
integration. This line of argument is worlds away from Einstein’s decision 
to disregard boundary conditions, since Weyl’s view derives its consistency
from the meaning of general invariance in the laws of physics. Just as the 
Riemannian metric of space-time retains at every point the Minkowskian 
signature, so the neighbourhood of every point is correlated to a time- 


Equivalence Between Newtonian and Relativistic Cosmology 359 


coordinate which is intrinsically distinguishable from space coordinates. 
The distinction is similar in any possible neighbourhood, no matter how 
the density may vary from point to point. A sort of cosmic (though quite 
formal) concept of time does emerge from this construction. Weyl approxi- 
mates to the level of reality when he raises the question of an analogy 
between the stellar system and a gas in statistical equilibrium, citing 
Maxwell’s law of distribution which predicts a much higher frequency of 
small velocities and an average uniformity in their distribution. Of course, 
were the analogy valid, the theoretical demand of cosmic time would have 
its counterpart in nature. Not surprisingly, Weyl deduces from the injection 
of this demand into the field equations an equation quite similar to that of 
Poisson, so that the density is made to vanish everywhere so long as the 
potentials remain constant. The next step in the sequence is the introduction 
of the cosmological constant. Whereas the Newtonian theory demands 
ρ = λ, the relativity theory allows for the curvature of space being substituted
for the density. In other words, Weyl argues that the Newtonian
island universe finds its natural corollary in the relativistic schema, provided 
that the boundary of the extension of matter is replaced by the curvature 
of the extension. 

Weyl’s strategy is a most interesting case, because it is simply a reversal 
of Einstein’s arguments. In the first place, Weyl assumes uniformity to be 
a true prediction of both the Newtonian and Einsteinian theories. The next 
step is that where Einstein claims that λ has the sole purpose of complying
with the static equilibrium, Weyl proceeds to show that it closes space, the 
equilibrium being the necessary premise. What this boils down to is that 
the difference between a Newtonian and a relativistic cosmology does not 
lurk in the latter’s closure of space, but rather in the hierarchy, in the mere 
order of what is to count as premise and what as conclusion. To Weyl, at 
any rate, such an upshot causes the gravest doubts. That λ should define
the radius of curvature of the whole universe is a ‘great demand on our 
credulity’. The words are no more than the amplified echo of all Weyl’s 
philosophy of relativity. Thus, speculating on the possibility of light-cones 
overlapping when they are trapped in a closed geometry, so as to gain an 
influence of the active future over the passive past, Weyl does acknowledge
‘that there is a certain amount of interest in . . . these possibilities, in 
as much as they shed light on the philosophical problem of cosmic and 
phenomenal time’ (p. 274). Nevertheless, general relativity ‘merely assumes 
that the neighbourhood of every world-point admits of a singly reversible 
continuous representation in a region of the four-dimensional “number-space”
. . .; it makes no assumption at the outset about the inter-connection
of the world’ (p. 273). If assumptions of that kind are to be put forward, in 
the manner of Einstein or de Sitter, no conclusive argument can be found 
so that the gap between the two models then available is seen by Weyl in 
the form of an irreducible antinomy: the world is either full of matter but 
not homogeneous or it is homogeneous and utterly empty (pp. 283-4). 


4 WEYL’S PRINCIPLE AND THE REVIVAL OF NEWTONIAN 
COSMOLOGY 


In the years that passed between the first enunciation of relativistic cos- 
mology and the rise of cosmologists like Milne and McCrea, the whole 
complexion of the problem underwent many transformations. Before the 
establishment of the non-statistical metrics, a significant degree of suspicion 
arose as to the validity of a statistical argument being applied to the whole 
universe. De Sitter was probably the first to point out the difficulties 
inherent in such an approach ([1921], p. 867). He claimed that ‘the idea of 
evolution in a determined sense appears ... to be rather opposed to the 
actual existence, if not to the possibility of equilibrium’. As an astronomer, 
he was bound to be powerfully affected by the latest research on galactic 
formation. Eddington had suspected, as early as 1914, that stars in the 
Milky Way exhibited the pattern of the rather early stages of evolution— 
whereas statistical interpretation could cope with a static equilibrium only 
in terms of the final stages of an evolutionary process. It was this dichotomy 
which induced Weyl ([1923]) to incorporate a principle of common origin
into cosmology, thereby providing that speculative science with what has 
since been referred to as its most fundamental assumption about the inter- 
connectedness of the world—Weyl’s Principle. 

For all this, and at the same time, attempts were being made to
save the integrity of Newtonian cosmology by suppressing the boundary 
conditions altogether, as Einstein had done in the framework of relativity; 
and these occurred long before Milne. In 1922, Charlier reverts to the 
Newtonian model he had first proposed in 1908 by articulating what became 
known as a hierarchical model, which drew on the speculations of a number 
of later eighteenth-century thinkers such as Kant and Lambert. According 
to Charlier’s model both the optical and the mechanical paradoxes of New- 
ton’s infinite universe are overcome when a specific arrangement between 
the different celestial systems is introduced: planets form planetary systems, 
planetary systems form galaxies, galaxies form clusters, and so on without 
limit. An appropriate choice of dimensions can make the average density 
appear to be equal to zero. To the eyes of most cosmologists today, ‘these
models are little more than a curiosity’ (Rindler [1977], p. 196). The sad 
fact is that Charlier’s universe is not really homogeneous (no volume is 
large enough to be typical), even though it may be claimed that the indefinite 
repetition of similar structures makes it tend towards homogeneity. But the 
real unwieldiness of his system is different again: this universe is only an 
indefinite series of local systems, without the slightest evidence of a natural 
connection between them. Of course, the appeal to a cosmological constant 
might seem just as arbitrary as this process of clustering. And following 
this line of argument, one of Charlier’s most impor- 
tant supporters, Franz Selety ([1922], pp. 291-2), brought into question 
once more the validity of Einstein’s arguments on boundary conditions. In
principle, Selety argues, the possibility of infinite differences in potential
has not been ruled out conclusively. A physics is construable without any 
need to define the potential at infinity. In order to calculate potential 
differences, one could begin with some equipotential surface (which can be 
determined empirically) whose value is assigned quite arbitrarily, and then 
reckon all other potentials either negatively or positively, a procedure quite 
compatible with the occurrence of infinite differences of potential. The 
fixing of boundary conditions is usually regarded as purely mathematical, 
in the sense that it enables the physicist to isolate a given system. Selety’s 
dominant idea seems to be that the superimposition of a physical meaning 
onto it is scarcely a proper way to convey a just or feasible representation 
of the entire universe: some element of arbitrariness being unavoidable, 
this can be located at will, whether in the infinite or elsewhere. 
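
The remark above that, in a hierarchical model, 'an appropriate choice of dimensions' makes the average density tend to zero can be illustrated numerically. The sketch below is a toy model, not Charlier's own construction: it assumes that every system of one level contains the same number of systems of the level below (`n_members`) and is larger in radius by a fixed factor (`radius_ratio`). Both parameters, and the unit mass and radius of the lowest level, are hypothetical values chosen for illustration.

```python
import math


def level_mean_density(level, n_members, radius_ratio, m0=1.0, r0=1.0):
    """Mean density of one system at a given level of a toy hierarchy.

    Assumption (illustrative only): a system of level i+1 contains
    n_members systems of level i and has a radius radius_ratio times larger.
    """
    mass = m0 * n_members ** level
    radius = r0 * radius_ratio ** level
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return mass / volume


def density_sequence(levels, n_members, radius_ratio):
    """Mean densities for levels 0 .. levels-1."""
    return [level_mean_density(i, n_members, radius_ratio)
            for i in range(levels)]


if __name__ == "__main__":
    # Each step multiplies the density by n_members / radius_ratio**3.
    for i, rho in enumerate(density_sequence(6, 1000, 100.0)):
        print(i, rho)
```

In this toy model the mean density is multiplied by `n_members / radius_ratio**3` at each step, so it falls towards zero whenever the radius grows faster than the cube root of the membership. The same numbers also show why no volume is ever 'large enough to be typical': the mean density keeps changing from level to level.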

It was Einstein who, in other writings, produced yet another argument 
against the case of Newtonian cosmology. A first version of it is to be found 
in the early elementary exposition of general relativity ([1957], p. 106); it 
is repeated in a 1932 paper (see Einstein [1933], p. 101). In it, Einstein
refers to Gauss’ theorem. The number of lines of force coming from the 
infinite and reaching a mass m is proportional to m. If the density dis- 
tribution of matter in the universe is a constant p, a sphere of volume V 
has a mass pV, so that the number of lines of force passing through the 
sphere is proportional to pV. But the surface of the sphere is proportional 
to R² and its volume to R³. As a result, the number of lines of force which
pass through each surface unit of the sphere is proportional to pR. The 
intensity of the gravitational field on the surface thus increases indefinitely 
with the radius R of the sphere. Disputing this conclusion, which Einstein 
found unacceptable, Milne countered with an argument of some episte- 
mological scope ([1935], p. 300). He claims that Einstein was confusing 
concepts and observable entities: ‘the notion of the “intensity” of a gravi- 
tational field’, according to Milne, ‘is a pure concept’, whereas the mathe- 
matical proof appeals to yet another concept, that of lines of force. Einstein 
shows Newtonian cosmology to be impossible by exhibiting the paradoxes 
of a physical picture of the infinite, but his boundary conditions are pri- 
marily hypothetical and are therefore physically meaningless. Milne prefers 
to test the validity of the model by referring to ‘the accelerations actually 
undergone by the particles present and capable of observation by other 
particle-observers’. 
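
Einstein's Gauss-theorem argument, as summarised above, can be written out in a few lines. This is a sketch in modern notation; the symbols $G$ (Newton's constant), $g(R)$ (field intensity at the surface) and $M(R)$ (enclosed mass) are introduced here for illustration and do not appear in the text.

```latex
% Flux of the field through a sphere of radius R equals 4*pi*G times the enclosed mass:
4\pi R^{2}\, g(R) = 4\pi G\, M(R),
\qquad
M(R) = \tfrac{4}{3}\pi R^{3}\rho .

% Dividing by the surface area gives an intensity proportional to \rho R:
g(R) = \frac{G\,M(R)}{R^{2}} = \tfrac{4}{3}\pi G\,\rho\, R .
```

For constant $\rho$ the intensity grows linearly with $R$ and without bound, which is the conclusion Einstein found unacceptable and Milne disputed.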


5 THE MILNE-McCREA CONTRIBUTION OF 1934 


Certainly Einstein did not revert to Newtonian cosmology after he had 
disposed of boundary conditions in general relativity. Nor did he make any 
serious attempt to revise his first model after Friedmann had shown in 1922 
that one could dispose of λ either, provided a dynamic picture was deemed
conceivable. If you were to suppress both the boundary conditions and λ
in the Newtonian static model, the result would be the process of clustering
described by Charlier. What remained was to evaluate the potential consequences
of dynamism as something capable of restoring unity. A consideration
of this question leads us, inevitably, to the work of Milne.

Milne’s introduction to his paper highlights much of the significance of 
the equivalence theme. It is no accident that a comparison between the two 
theories is first made in relation to the so-called Einstein—de Sitter model. 
In 1932, Einstein and de Sitter had reconciled (in a manner of speaking) so 
that they were able to propose a new model for the universe in a joint paper. 
There is no point in denying that Einstein does his best to beat a retreat in 
that model. In this new theory, ‘there is no direct observational evidence 
for the curvature [of space]. . . . It is therefore clear that from the direct 
data of observation we can derive neither the sign nor the value of the 
curvature, and the question arises whether it is possible to represent the 
observed facts without introducing a curvature at all’ (p. 213). In a manu- 
script dating back to some time in September 1932, Einstein wrote that a 
non-zero density of matter does not necessarily imply a spatial curvature
(the assumption of the 1917 paper de Sitter had tried to refute); on the 
contrary it was a spatial expansion ([1933], p. 109) and therefore the conse- 
quent model is that of a Euclidean expanding space, which will yield the 
simplest of all Friedmann universes (with λ = 0). The option chosen is a
telling sign of the theory behind the newly discovered physics of non-static 
metrics. The Friedmann equation (2) relates the variation of the scaling 
factor with time to the density of matter, the geometry of the universe and 
(at least very possibly) to the cosmological constant. The difficulty is that 
a redshift measurement only tells how much the universe has expanded 
since the epoch of emission. Distances, recession velocities and the lookback 
time all depend on the geometry of space and on how the scaling factor 
changes with time. As long as no test is available which can single out one 
of these factors and provide an independent measurement of it, a necessary 
choice from among the relevant quantities must be made at the outset. The 
Einstein—de Sitter model opts for a simple geometry; as the geometry is 
fixed, the corresponding distances, the recession velocities, the lookback 
times and the age of the universe can be calculated. In the book he was 
writing at the time, de Sitter proffers the opinion that ‘we shall never 
be able to say anything about the curvature without introducing certain 
hypotheses’ ([1932], p. 117). 
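
The point that fixing the geometry makes the remaining quantities calculable can be illustrated for the Einstein-de Sitter case. The sketch below uses the standard textbook relations for that model (scale factor proportional to t^(2/3), age equal to 2/(3 H0)). The function names and the value of the Hubble constant in the example are illustrative assumptions, not taken from the papers discussed.

```python
# Conversion factors (standard values).
KM_PER_MPC = 3.0856775814913673e19   # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16         # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H0 in Gyr for H0 given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_GYR


def eds_age_gyr(h0_km_s_mpc):
    """Age of an Einstein-de Sitter universe: t0 = 2 / (3 H0)."""
    return (2.0 / 3.0) * hubble_time_gyr(h0_km_s_mpc)


def eds_scale_factor(t, t0):
    """Scale factor normalised so that a(t0) = 1: a = (t / t0)^(2/3)."""
    return (t / t0) ** (2.0 / 3.0)


def eds_lookback_time_gyr(z, h0_km_s_mpc):
    """Lookback time to redshift z: t0 * (1 - (1 + z)^(-3/2))."""
    return eds_age_gyr(h0_km_s_mpc) * (1.0 - (1.0 + z) ** -1.5)


if __name__ == "__main__":
    h0 = 70.0  # km/s/Mpc, an assumed value for illustration only
    print("age [Gyr]:", eds_age_gyr(h0))
    print("lookback to z = 1 [Gyr]:", eds_lookback_time_gyr(1.0, h0))
```

Once the geometry is chosen, a single measured number (here the Hubble constant) fixes the age and the lookback time for any redshift. With a different geometry the same redshift would correspond to different times, which is why the choice has to be made at the outset.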

Milne seems to conceive of no higher aim than attacking the cosmological 
problem in just that hypothetical and speculative spirit. But an essential, as 
opposed to a momentary solution, can be found only if the very concept of 
space is seen as a mathematical construct devoid of any physical significance 
(see Milne–McCrea [1934], p. 64). The choice of a particular space is not
constrained in any way by physical considerations, but is simply a matter 
of convention. Following Poincaré, Milne opts for Euclidean space. In 
contrast to the Einstein–de Sitter selection of this space, Milne’s procedure
restricts the class of possible universes by overcoming the need for arbitrary
reduction in the number of independent variables. In short, the guiding 
line of reasoning in Milne’s treatment of the problem is the theory of 
kinematic relativity which Milne had just begun to elaborate a year earlier 
([1933]). This theory is supposed to be an account of the actual systems
presented to observers; in the treatment of those systems it denies the 
validity of any assumed theory of gravitation. 

Because no content can be attached to the phrase ‘space of nature’, the 
comparison between Newtonian and relativistic cosmologies focuses on the 
possibility of a locally Newtonian time in relativistic cosmology (Milne— 
McCrea [1934], p. 71), with the provision that in dealing with Newtonian 
time one ‘assumes the usual definition of simultaneity by means of light- 
signals’ (p. 65). The result is a solution to the miscomprehended problem 
of motion versus uniform distribution in a Newtonian universe. Milne’s 
cloud of freely moving particles conveys the idea of matter and motion as 
given simultaneously, in contrast to the more classical concepts of motion 
as disturbing a pre-existing uniform distribution or pre-existing non-homo- 
geneity as precluding the motion which would effect a return to uniformity 
(this latter view is Newton’s argument to Bentley, letter IV). Both of 
Einstein’s objections fade away: there is no unique centre and the infinite 
is not a ‘place’. The demand of perpetual interaction between particles and
all forms of radiation is declared by Milne to be a metaphysical demand
([1935], p. 301). Milne shows ‘that Newtonian systems can be constructed
in which the relative accelerations are small near the observer and every- 
where finite in his experience’, thereby narrowing the field of causal inter- 
action to the sphere of observability of a given observer. This is a new kind 
of demand which is derived from the special theory of relativity, a theory 
which Milne indeed claims to have incorporated, to a large extent, in his 
kinematic theory. Still, Milne is clearly not at ease in doing so because he 
reverts to the large-scale point of view in a moment of muddleheadedness: 
the notion of curvature of space, he says, ‘is merely a mathematical device 
for describing on both the large and the small scale what is equally well 
described on the small scale (locally) in Newtonian terms, and what is 
formally describable on the large scale in the same terms’ (p. 316). The 
recourse to a merely formal equivalence when the large scale is contemplated 
is, in fact, the give-away which highlights the crucial issue.


6 LAYZER’S CRITICISM AND McCREA’S RESPONSE 


In the first edition of his [1952] textbook, Bondi emphasises the ambiguity 
of the definition of inertia according to the Milne—McCrea model, since all 
reference systems are assumed by it to be both inertial and accelerated 
relatively to each other. Bondi tries to bypass the difficulty by arguing that 
‘as long as we assume that each observer only uses his own system no 
difficulties or contradictions arise’ (p. 178). Whatever the merits of his
resolution may be, if Milne does not seem to have perceived the problem,
Bondi can hardly be said to have resolved it. Indeed, Milne’s problematic 
solution is one version of the clash between the Newtonian and the rela- 
tivistic class of allowable transformations of coordinates, while Bondi’s 
answer simply discards the problem, failing to wrestle with it at all. Milne’s 
original intention was to abandon the relativistic transformation of coor- 
dinates and replace it by transformations from observer to ‘equivalent’ 
observer, where equivalence is to be ‘defined in terms of observations and 
tests which the observers can actually carry out’ (Milne [1935], p. 5). Milne 
does not address Bondi’s problem, because his own idea of transformation 
between equivalent observers dominates all his conclusions, and it is a 
conception in which any distinction between inertial and accelerated 
observers becomes irrelevant. Therefore, when the Milne-McCrea model 
is characterised by, and discussed on the basis of, an interplay between 
Newtonian and relativistic concepts, the specification of which particular 
concepts are being used is always essential. Without such critical assess- 
ment, it becomes difficult to get away from flat assimilation of concepts, 
devoid of any genuine equivalence. For instance, it is not clear whether
Bondi conceives of his solution as restricting the validity of the postulated 
equivalence or as the distinctive feature of the Milne-McCrea model of the 
universe. 

In a paper which appeared two years later, Layzer [1954] went on to deal 
explicitly with Bondi’s statement of the problem. In point of fact, Layzer 
is working strictly within the framework of general relativity. His claim is 
that all neo-Newtonian derivations of the Friedmann equation are invalid, 
because the Milne-McCrea model is incompatible with the Newtonian 
conception of gravitation. The same kind of criticism had already been 
levelled long ago at the theory of kinematic relativity by Robertson ([1935] 
and [1936]). Robertson endeavoured to show that the law of gravitation was 
not analytically derivable from the operational definition of the kinematic 
substratum. The whole programme of a deductive cosmology therefore 
seemed to be undermined, since the determination of the laws of nature 
could not be obtained unequivocally from the description of cosmic struc- 
ture. Layzer seems to employ a comparable line of argument when he 
demonstrates that the Bondi problem evaporates only in the case of unac- 
celerated expansion (p. 269). This case takes its bearing from the propo- 
sition that ‘all forms of Newton’s law of gravitation depend on the idea 
that the specific gravitational force at every point is determined by the 
instantaneous distribution of matter in the universe, and is independent of 
the state of motion of the matter’. Now, because the distribution of matter 
taken alone defines no preferred direction in space, it follows that the 
potentials are constant and identical everywhere. The expansion of Milne’s 
cloud of particles is as a consequence not accelerated. Layzer could have 
concluded just as easily that no motion at all is possible, since he quite 
deliberately treats the distribution and the motion as each being on a different
footing. The distinction is contrary to Milne’s approach, and highlights the
fact that Milne is interested in much more than the mere form of Newton’s 
law. Not surprisingly, then, Layzer reveals the invalidity of the Newtonian 
derivation of the Friedmann equation by underlining its incompatibility 
with the following pair of statements: (a) the relative accelerations are equi- 
valent to gravitational forces, as required in the Milne—McCrea theory; (b) 
the instantaneous distribution of matter determines the gravitational force 
at any given point. It should be clear that it is the distinction between the 
distribution and the motions which dictates the criticism. Of course, it 
would be futile to deny that Layzer’s arguments are mathematically sound. 
The point is that, by clinging to methods of mathematical physics acknowl- 
edged as valid in normal circumstances, they fail to take account of Milne’s 
intention. 

In response to these arguments, McCrea rehearses a more radical view of 
the subject ([1955], p. 273—Milne had died five years earlier). If we take a 
system with uniform density, McCrea says, and ‘if the gravitational force is 
to be defined in the present (Newtonian) manner, then it does not exist . . .’; 
the fact that the force shall be zero in that case cannot be inferred, since 
the very concept of force becomes nugatory. Primarily in this response 
to Layzer McCrea manages to perceive the limiting role of traditional 
mathematical physics in Milne’s general approach. By the same token, 
McCrea solves the Bondi problem by realising that there is only one ideal 
Newtonian reference frame, i.e., that which is located at the ‘true’ centre
of the cloud. All properties relative to any observer moving with the material 
of the universe can be deduced by the resulting motion relative to that 
ideal frame, the resulting motion being obtained from purely Newtonian 
kinematics (p. 272). McCrea’s achievement is to have revealed that the 
simultaneity distribution/motion is necessarily dependent on the possibility 
of a particular unique centre. 

His conclusion is so far-reaching because the idea of a unique centre 
resurrects the old question of the universe having a determinate boundary. 
Layzer’s paper is primarily concerned with that problem. The dynamics of 
a uniform system with zero pressure is proved to be the same, regarding 
gravitational properties of an expanding sphere, when the system is an 
island-like universe and when space itself is expanding. In this respect, the 
Newtonian theory is quite satisfactory, given that the dynamical properties 
are rendered without approximation by that theory. Yet, the equivalence 
collapses when a uniform, unbounded distribution of matter is in question. 
It is notoriously the case that a unique solution for the potential cannot 
be derived from Poisson’s equation when boundary conditions remain 
unspecified. Milne perceived the problem quite clearly, since he first con- 
structed the equivalent of the Einstein-de Sitter model by stating the 
parabolic velocity of escape from a certain sphere. In writing that equation, 
Milne says, ‘We are not using the notion of gravitational potential, here 
inapplicable, but are employing simply an integral of the equation of motion
with a particular value of the constant of integration’ ([1934], p. 68). In
fact, the particular value was zero, though it was later shown to be positive 
in the case of hyperbolic orbits, negative with elliptical orbits. Milne’s 
supposition is that the conditions at infinity are simply compatible with the 
assumption that the material outside the sphere can have no influence on 
the motions inside it (as in the well-known theorem of mechanics). Layzer, 
on the other hand, emphasises the Newtonian-relativistic equivalence in a 
bounded system by recalling two theorems of Bondi. One of these states 
that it is by neglecting the influence of exterior matter that we are enabled to 
determine without approximation the interior motions via Newton’s theory. 
Without the very existence of some frontier as a necessary precondition, 
the equivalence will not be applicable. 
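
The 'integral of the equation of motion' that Milne refers to can be sketched as follows. This is a reconstruction in modern notation rather than a quotation from the 1934 papers: $r$ is the radius of a comoving sphere, $M$ the constant mass inside it, $\rho$ the density, $G$ Newton's constant, and $E$ the constant of integration. All of these symbols are introduced here for illustration.

```latex
% Motion of a particle on the surface of the sphere, exterior matter ignored:
\ddot r = -\frac{G M}{r^{2}},
\qquad
M = \tfrac{4}{3}\pi \rho\, r^{3} = \text{const.}

% First integral (the constant E is the constant of integration):
\tfrac{1}{2}\,\dot r^{2} - \frac{G M}{r} = E .

% E = 0 is the parabolic case (velocity of escape); E > 0 is hyperbolic, E < 0 elliptic.
% For E = 0:
\dot r^{2} = \frac{2 G M}{r} = \frac{8\pi G}{3}\,\rho\, r^{2},
\qquad
r \propto t^{2/3} .
```

The case $E = 0$ reproduces the expansion law of the Einstein-de Sitter model. The derivation uses only the mass inside the sphere, which is why the status of the exterior matter, and hence of the boundary, becomes the point at issue between Milne and Layzer.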

In the light of this criticism, McCrea now modifies the whole position as 
it had been accepted hitherto. ‘We can suppose’, he writes, ‘any observer 
to have a finite range of observation. So, in particular, we can take the 
extent of the system to be arbitrarily large compared with this range’ 
([1955], p. 272). As a result of this arbitrariness, only a small fraction of
the observers will perceive the edge of the world. So, ‘the difference between 
an arbitrarily large system and an unbounded system is scarcely significant’. 
At the same time, McCrea argues that a consistent picture of a truly 
unbounded system would demand an altogether new meaning for the con- 
cept of a gravitational field. Yet, he seems to annihilate the profundity of 
such a remark when he calls Layzer’s objection merely a matter of ‘definition 
of “Newtonian” gravitation for an unbounded system’ (p. 274). It is true 
nonetheless that he saw his own concept of the arbitrarily large system as 
providing an appropriate ‘operational standpoint’ (p. 272). 


7 MORE RECENT DEVELOPMENTS 


When Bondi returns to the question himself in the second edition of Cos- 
mology, he reminds the reader of the recent elucidation of the potential 
problem in an infinite system articulated by Layzer and McCrea (p. 79). 
He then begins his own deduction with a characteristically loose for- 
mulation: ‘for our purposes it is sufficient to use Poisson’s equation’. 
Bondi is therefore quite prepared to abandon the classical requirement of 
boundary conditions. There is little doubt that he does so in the light 
of relativistic cosmology, since he also introduces into his discussion the 
cosmological constant λ (which Milne did not). In somewhat uncritical
fashion, he claims that the change is entirely arbitrary in the Newtonian 
theory, but not in the relativistic one. Accordingly, Bondi incorporates 
λ in Poisson’s equation as ‘the Newtonian analogue of this relativistic
procedure’. From now on, the high degree of comparability rather than the 
equivalence between the two cosmologies is the theme which Bondi wants 
to follow. John North has remarked that the absence of boundary conditions
in these classical equations allows for as indefinite a number of density
distributions as are compatible with a given gravitational field ([1965], p. 
183). He says that ‘it might be reasonably objected that a theory which can 
explain a whole range of possibilities can have little predictive power in the 
relevant respects’. Yet, the contrast with any other type of multiple solution 
is difficult to assess here. In particular, the range of possibilities predicted 
by relativity does not entirely escape some form of logical problem which 
affects all procedures spoken of so far. The problem has been given a very 
clear form by Bertrand Russell in one of his most striking pieces of 
argument. Referring to the Machian conception of dynamics, Russell 
equates its logical basis with the idea ‘that all propositions are essentially 
concerned with actual existents, not with entities which may or may not 
exist’ ([1937], p. 493). But there is no logical contradiction in the possibility
of laws being applied to universes which do not exist. In the calculations of 
the distribution of matter, ‘it can be no necessary part of their meaning to 
assert the existence of the matter to which they are applied’. The conse- 
quence of this is that the universe is brought into being as many times as 
there are possible distributions of matter. 

At almost the same time as North’s criticism there appeared a paper 
which can be seen as the standard formulation of the present-day ratifying 
of the idea of equivalence (Callan, Dicke and Peebles [1965]). Significantly 
enough, just as Milne had designed his model in order to overcome the 
fundamental indetermination arising from the orthodox relativistic
solution, so this first complete assent to the equivalence was developed 
hand in hand with a new conceptual tool that seems to lay the foundations 
for an alternative approach to the question of the universe. The authors 
make a frontal attack on the puzzling question of a possibly devastating 
effect of the expansion of space. They want to substitute the expression 
‘expansion of the universe’ for ‘expansion of space’, claiming that only the 
former concept can release physics from the spell of a mysterious power 
that initiates the pulling apart of galaxies. Motion in space rather than of 
space is the key to the proposed substitution, yet no discussion is proffered 
as to its meaningfulness. Newtonian mechanics, as a general and loose 
term, is supposed to provide ‘neither a crude approximation to the correct 
relativistic calculation, nor a cooked-up montage cleverly contrived to look 
like the real thing’ (p. 105). That Newtonian mechanics is entirely adequate 
is a conclusion the proponents reach in a rather revealing manner. Of 
course, no one wants to deny that the classical concepts fail to describe the 
dynamics of vast regions separated by velocities comparable to the velocity 
of light, and this is the authors’ first admission. The calculation of R(t) is
possible in Newtonian terms only for two neighbouring galaxies, but the 
argument then moves without the slightest logical transition to the assertion 
that the calculation is ‘also applicable to the whole universe’ (p. 108). 
Implicitly recalling one of the Bondi theorems already worked out by 
Layzer, the conclusion is that isotropically distributed matter has no physical
effect on the interior of a large spherical volume, irrespective of the
Newtonian or relativistic nature of these effects. 
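
The step from two neighbouring galaxies to a scale factor can be written out explicitly. What follows is a minimal sketch of the standard Newtonian argument, not a transcription of the authors' own calculation; the symbols G (the constant of gravitation), ρ (a uniform matter density), r₀ (a fixed comoving label) and k (a constant of integration) are assumptions introduced here for illustration. A galaxy at distance r = R(t) r₀ from an arbitrarily chosen centre is taken to be attracted only by the mass M inside the sphere of radius r:

\[
\ddot r = -\frac{GM}{r^2}, \qquad M = \frac{4\pi}{3}\rho r^3 = \text{const}
\quad\Longrightarrow\quad
\frac{\ddot R}{R} = -\frac{4\pi G}{3}\rho .
\]

Multiplying the first relation by \dot r and integrating once gives

\[
\dot R^2 = \frac{8\pi G}{3}\rho R^2 - k ,
\]

which has the form of the Friedmann equation for pressureless matter. The sketch makes the contested step visible: only the matter inside the sphere enters, so the extension to the whole universe rests entirely on the assumption that the exterior matter exerts no net effect.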


8 THE MATHEMATICS VERSUS THE PHYSICS OF THE 
EQUIVALENCE 


When Schücking reverts to the problem of the universe as an unbounded
totality in his 1967 papers, he is right to emphasise that we cannot neglect 
the gravitational field contributed by all matter, but he also shows a way 
out of the difficulty by transforming the whole problem. He re-shapes the 
problem so that it becomes a problem of the operational definition of the 
inertial system. Now clearly such a definition can be given only in the 
absence of matter, i.e., when the influence of gravitation is ignored by some
sort of absolute decree (along the lines of what Newton had already done 
in the celebrated thought experiment of two rotating globes in an otherwise 
empty universe). In order to get round this kind of difficulty, Schücking
introduces the characteristically relativistic notion of local inertial systems 
and then finds the appropriate law of transformation for the potential, 
ensuring overall finite solutions ([1967a], p. 274 and [1967b], p. 223).
This fits in with his general definition of the particular subject matter of 
cosmology, that is, ‘all circumstances of which we have positive knowledge’ 
([1967b], p. 221). In contrast, Einstein’s original discussion arose from a
clear division between two kinds of problem ([1957], pp. 71-3 and p. 105), 
where the definition of the inertial system and a consideration of the 
universe as a whole were conceived of as two quite distinctive endeavours; 
their only connection being simply that they are two factors which compel 
us to abandon the framework of classical physics. At an opposite extremity, 
the case of Layzer is also enlightening. Originally Milne intended no more 
than establishing an equivalent to the Einstein–de Sitter model on the basis
that both models predict the same observable entities in local terms. Yet, 
Layzer invalidates the Newtonian derivation of the Friedmann equation 
because the equivalence breaks down in the case of an unbounded system; 
this does not prevent him from reinstating equivalence of the local kind by 
recapitulating Bondi’s theorems. 

In view of these waverings, it is desirable to go back to Milne’s original 
programme in its most fully developed form. Milne wanted to demonstrate 
(following S. R. Milner; see Milne [1934], pp. 65–6) that the same equations
are apt to receive different interpretations. The contents of the two
theories therefore do not in any simple sense yield their essential feature 
by virtue of the accompanying equations. In this context, Milne’s suspicion 
is reminiscent of a much earlier idea on the degrees of equivalence between 
the two theories. Prior to the first relativistic model of the universe, de 
Sitter had already cast some light on a particular degree of comparability 
([1916]). His claim was that different equations would yield the same type
of solution, were the relativists to maintain the requirement of determinate,
universal values for the constants of integration occurring in the solutions
of differential equations. De Sitter argued that to uncover the essence of 
general relativity, it was necessary to show that ‘the differential equation is 
the fundamental one, and the choice of the constants of integration remains 
free’ (p. 529), whereas nothing relativistic would be revealed by any New- 
tonian theory whose absoluteness would always manifest itself in the pre- 
scription of the constants. Both de Sitter and Milne strive to restore what 
they claim to be the essence of general relativity, de Sitter with the intention 
of enhancing it and Milne in order to discredit it. Milne’s strategy is 
diametrically opposed to de Sitter’s: he shows that the same equation (the 
Friedmann equation) allows of quite different interpretations. 

The status of the whole question is therefore best apprehended with 
reference to the actual differences between the Newtonian and the relativistic 
models of the universe. In actual fact, the physics of the two universes are 
not exactly the same, as Bondi has emphasised with some vividness: 
‘... the difference between relativistic and Newtonian theories is governed, in
cosmology, by the ratio of pressure to density, and ... the ratio of gravitational
potential to rest mass is important mainly in local applications’ (p.
104). Bondi obtains this result by comparing the respective derivations of 
the Friedmann equation. A dynamic interpretation of the Robertson–Walker
metric (4) implies a simplified form of the material tensor, where
the material density ρ and the isotropic pressure p are functions of the time
alone. A straightforward consequence of the resulting relations is the equation
dE + p dV = 0, verifying the law of the conservation of energy. Now,
a consequence of equation (2) using the Newtonian equations is dE = 0,
which shows that any full equivalence between the two theories demands 
that the pressure be very small with respect to the energy due to matter. 
This, indeed, is quite a good approximation of reality and has nothing to 
do with the theory: the kind of approximation involved here is not of the 
usual type between a Newtonian and a relativistic equation, since nothing 
enables us to say that the pressure is a purely relativistic effect. Bondi makes 
a further confusion between reality and theory when he concludes that ‘in 
a universe like ours is now, where both these ratios are always very small, 
relativity cannot offer anything radically new’ (pp. 104-5). The conceptual 
nature of the alleged equivalence is virtually concealed when it is arrived 
at in this lopsided way. 
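
The comparison between the two conservation laws can be made concrete under assumptions introduced here only for illustration: that a comoving volume scales as V ∝ R³, and that the energy it contains is E = ρc²V, with c the velocity of light. On that reading the two statements become

\[
\text{relativistic:}\quad \frac{d}{dt}\left(\rho R^3\right) = -\frac{p}{c^2}\,\frac{d}{dt}\left(R^3\right),
\qquad
\text{Newtonian:}\quad \frac{d}{dt}\left(\rho R^3\right) = 0 .
\]

The two coincide to the extent that p/ρc² is negligible, which is the ratio of pressure to density singled out by Bondi. Nothing in the sketch marks the pressure term as specifically relativistic; it only shows where the two derivations part company.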

A suitably generalised discussion would therefore have to be of a precisely 
opposite kind. As it happens, it has already been carried out in a peculiarly 
neglected paper published by McVittie some two years after the first edition 
of Bondi’s book ([1954]). This is the text of a paper delivered as part of a 
symposium on the structure of the universe in December 1953. Layzer 
must have known it, since he himself contributed a paper on the origin of 
the solar system. McVittie is not at all concerned with any deduction of the 
Friedmann equation from Newtonian postulates. On the contrary, he 
adopts the opposite strategy of seeking a Newtonian approximation to the
relativistic formulae, explicitly given at the outset. His discussion is not
even limited to a priori uniform models: the demand is primarily for a 
theory of spherical symmetry, simply because ‘we can observe the universe 
from one point in it only, namely the earth’ (p. 173). Uniformity is intro- 
duced much later for the sake of comparison in terms of putative equiva- 
lence. The approximations are twofold: the constant of gravitation in the 
field equations (higher powers of κ than the second are neglected), and the
velocity of light (c is identified with an infinitely large constant, since this 
is the only value which remains unchanged in the transformations from one 
inertial system to another). McVittie proceeds to articulate a Newtonian 
approximation of the relativistic equations in the case of spherical 
symmetry, and he finds that density and pressure gradients are included in 
the corresponding formulae. Another restriction is the limitation to uniform 
models. At this stage, the impossibility of using a co-moving coordinate 
system in the Newtonian instance (there is an absolute space in which 
matter moves) reveals itself as crucial. For that reason we find that the 
density, pressure and radial velocity of the cosmic fluid are independent of 
each other—a significant contrast to the relativistic case. The important 
point is that, whereas the density and radial velocity are known functions 
of absolute time, the pressure remains an arbitrary function of it, and 
therefore ‘it cannot be shown that a relativistic model in which p = 0
corresponds to a Newtonian model in which p = 0 also’ (p. 180). There is
some significance here in McVittie’s preference for the term analogy rather 
than equivalence. 

Thus, in the relativistic case, density and pressure are interdependent, 
in accordance with the already established superiority of general relativity 
over Newtonian physics (relativity links previously unrelated laws of 
nature, such as the identity between the inertial and the gravitational mass, 
Poisson’s equation and the conservation laws). Yet, a unique and dynamical 
interpretation of R(t) cannot be inferred from that interdependence, even 
in the case of uniformity. This exemplifies the a priori irrelevance of general 
relativity to any cosmological consideration, a fact that Weyl pointed out 
as long ago as 1918. On the other hand, the independence of pressure 
and density in the respective Newtonian models does imply that R(t) is
predetermined, to use McVittie’s terms (p. 179). 

The last possible point of parallelism concerns the cosmological constant. 
McVittie links this problem with the pressure. It should be realised how 
Einstein himself hit on the occurrence of such a constant in his first model. 
He obtained the metric of the cylindrical universe in a way that is very 
similar to the Schwarzschild interior metric. A priori, the metric in both 
systems is of the form 


\[
ds^2 = -e^{\nu}\,dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2 + e^{\mu}\,dt^2
\]


where ν, μ, ρ and p are functions of r only (ρ and p are also independent of
ν and μ). This is reducible to the Minkowski metric when both ν and μ are
zero, so that the problem is to determine ν and μ in relation to the actual
distribution of matter. Both the interior and the cosmological metrics yield
the following system of equations:


\[
\begin{aligned}
\text{(a)}\quad 8\pi p &= e^{-\nu}\left(\frac{\mu'}{r} + \frac{1}{r^2}\right) - \frac{1}{r^2} \\
\text{(b)}\quad 8\pi\rho &= e^{-\nu}\left(\frac{\nu'}{r} - \frac{1}{r^2}\right) + \frac{1}{r^2} \\
\text{(c)}\quad \frac{dp}{dr} &= -\frac{\rho + p}{2}\,\mu'
\end{aligned}
\tag{5}
\]


where (c) is the fundamental equation of perfect fluids. The two metrics 
differ in the choice of boundary conditions. Firstly, the cosmological metric 
demands that ν and μ tend to zero when r itself tends to zero, while the
interior metric demands the same when r tends to infinity. Secondly, there 
is a definite option for the boundary condition of the pressure in the 
interior case (a decreasing pressure from centre to periphery), while the 
cosmological metric requires a constant pressure throughout space. A null 
pressure at the periphery leads to 


\[
\rho + p = A\,e^{-\mu/2}
\]


(where A is a constant of integration), the solution of which provides the 
well-known interior metric. An overall constant pressure, on the other 
hand, leads to 


\[
\mu'(\rho + p) = 0.
\]


As a consequence of μ′ = 0, we have μ = const. The function μ is even zero,
since e^μ = 1 for r = 0. From equation (5a),


\[
(8\pi p)\,r^2 = e^{-\nu} - 1
\]


i.e., e^(−ν) = 1 − r²(−8πp). With this relation, the metric of the cylindrical
universe can be written as

\[
ds^2 = -\frac{dr^2}{1 - r^2(-8\pi p)} - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2 + dt^2.
\]

The coefficients of this metric will have the appropriate Minkowskian
signature only for r² less than −1/(8πp). In other words, the radius of
curvature R of the whole world is


\[
R^2 = \frac{1}{-8\pi p}.
\]


Indeed, that is the only value which satisfies the general condition of a 
spatially finite universe. This is also the fullest expression of the problem, 
since the right-hand side of this equation should be positive like R². The
cosmological constant λ enables us to make the right-hand side positive:


\[
R^2 = \frac{1}{\lambda - 8\pi p}.
\]


The fact that Einstein takes the pressure to be zero is not really an accident,
because the radius of curvature now becomes


\[
R^2 = \frac{1}{\lambda}.
\]


With a zero pressure, the value of the cosmological constant is not confined 
a priori to a particular set of values. This tallies with Einstein’s early doubts 
about the physical nature of the constant; in a subsequent paper ([1919]), 
he tried to show that λ is comparable to a mere constant of integration and
that it has nothing to do with a constant of nature. In doing so, Einstein is 
certainly following de Sitter’s criterion of freedom for the values of such 
constants in general relativity. Yet, the relation of λ to R remains, of course,
very puzzling indeed; de Sitter himself saw R as a purely mathematical
quantity, justified by some philosophical need which does not of necessity 
pertain to the theory of general relativity. In trying to assert both the 
physical consistency of relativistic cosmology and its superiority over New- 
tonian cosmology, McVittie is very much to the point when he concludes 
that ‘the cosmical constant of general relativity does not give rise to a force 
of repulsion proportional to distance in Newtonian theory but manifests 
itself by the appearance of a trivial additive constant in the pressure’ (p. 
180). 
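
How the constant enters can be checked against the system (5), writing ν and μ for the two metric functions, ρ for the density and p for the pressure. As a hedged sketch, assume the form usually given for the static, spherically symmetric case, in which the constant λ modifies equations (5a) and (5b) additively; this form is supplied here for illustration and is not quoted from the papers under discussion:

\[
8\pi p = e^{-\nu}\left(\frac{\mu'}{r}+\frac{1}{r^2}\right)-\frac{1}{r^2}+\lambda,
\qquad
8\pi\rho = e^{-\nu}\left(\frac{\nu'}{r}-\frac{1}{r^2}\right)+\frac{1}{r^2}-\lambda .
\]

With μ′ = 0 the first equation gives e^(−ν) = 1 − r²/R², where R² = 1/(λ − 8πp), in agreement with the expression for the radius of curvature. Substituting into the second equation, and using ν′e^(−ν) = 2r/R², one finds

\[
8\pi\rho = \frac{3}{R^2}-\lambda ,
\]

so that for p = 0 the two relations reduce to R² = 1/λ and λ = 4πρ. On these assumptions the constant, the radius and the mean density are tied together in the static model, which is one way of seeing why the relation of λ to R could appear so puzzling.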

Of course, Milne himself had shown in the meantime that the cosmological
constant is not necessary for the prediction of motion in a Newtonian
universe, just as Friedmann did in the case of a relativistic universe.
The difficulty raised by the existing parallel, which had been deemed to 
convey a true equivalence, is thus of a quite general order: what does 
dynamic cosmology add to the underlying assumptions of early, pre- 
dynamic cosmology? 

In 1917, Einstein discussed the Newtonian model solely on the grounds of 
its unsuitability for predicting the definite size (periphery) of the universe, 
whether finite or infinite. Yet, because of the requirement of static equi- 
librium, the cosmological constant could appear as if it were the bridge 
between the ‘infinite extension of central space’, in the Newtonian model, 
and the ‘self-contained continuum of finite spatial volume’, according to 
the relativistic concept ([1917], p. 180). Einstein can see clearly enough that 
the crux of the difference between the two models comes from their object 
and purpose rather than from the formal nature of λ: while Newtonianism
constitutes a view of the centre, relativistic cosmology involves a leaning 
towards the periphery. While still ignoring all of the non-static forms of the 
metric, Weyl has contributed most to our understanding of the reversal 
which this decision implies, albeit by showing that λ itself has the effect of
closing space rather than simply making possible the quasi-static distribution
of matter. But the advent of the non-static forms, together with
the denial of the cosmological constant as a necessary condition, tends to blur
Einstein’s original emphasis. Thus, the equivalence as worked out by Milne 
is ratified by the apparent failure of relativistic cosmology to predict a
definite size for the universe. The theoretical postulation of equally possible
universes proves the ultimate choice not to be dependent upon some kind of 
absolutely objective agreement between theory and observation. In Milne’s 
mind, this fact alone condemns the relativistic theory as too loose and, if 
left to its own devices, sorely in need of the greater stringency which he 
thought he had found. 

At this point, Milne’s formulation of the concept of an expanding cosmic 
cloud acts as the cornerstone of the whole question. In replacing extension 
with expansion, the dynamic picture of the universe erases the salient 
distinction between centre and periphery; the expansion proceeds from a 
centre which is not only the origin but also (and for that reason) the 
condition of the periphery. In the fullest sense of the term, the relativistic 
version cannot be written off as failed cosmology in that each model or 
solution taken in isolation does prescribe some determinate size of the 
universe. For instance, the Einstein–de Sitter model is compatible with a
definite, albeit an infinite, Euclidean geometry. Milne’s criticisms have 
their pertinence at precisely that level: space so conceived cannot be a part 
of physics at all. In consequence, his aim of showing the equivalence 
between the two cosmologies, so far as the observable part of the universe 
is concerned, seems to be based on a notion of observability which is very 
far from any theory of relativity. In Milne’s own theories as well as in his 
criticism of traditional cosmology, observability is taken as the defining 
quality of a cosmological model or construct. Milne repeatedly asserts that 
the relativistic phenomenon of fresh particles entering an observer’s field of
view is no different from the creation of matter within experience (e.g., [1935],
p. 9). One is reminded here of Eddington’s far-reaching formulation:
as early as 1923, he wrote about the horizons in cosmology that ‘it is 
impossible to know whether to blame the world-structure or the inap- 
propriateness of the coordinate system’ (p. 165). Kinematic models always 
exhibit within the field of view of any particular observer, at any epoch 
of observation, all the particles already in existence. Clearly, the actual 
overlapping of causality and observability neutralises the kind of concern 
for limitation which Eddington has in mind, because this overlap is a quality 
of the model, not of the universe. At the other extreme, the relativists would 
see observability as the very subject matter of cosmology, identifying the 
universe as it must be in itself with what may be seen happening in it. 
Thus, the relativistic response to the tension between theory and observation 
created by the rise of dynamic cosmology is to say that it is observation,
not theory, that needs greater stringency. For instance, Layzer holds the 
view ‘that the defining properties of the universe have the status of natural 
laws’, in the sense that this ‘encourages us to construct theories that are 
especially vulnerable to observational disproof’ ([1967], pp. 237-8). 
Accordingly, Layzer’s argument to the effect that Newtonian and rela- 
tivistic cosmology cannot be equivalent in the case of an infinite universe 
has the effect of reinforcing this specific kind of ‘definition’ of what the
universe ‘is’. Along with such a definition, and given the apparent impossibility
of confirming the theory of kinematic relativity by decisive observational
evidence, Milne’s ideas have almost never been analysed in terms
of their most relevant features. A characteristic criticism is Max Born’s 
account of the theory as exhibiting an obvious discrepancy between a 
realistic idea and Milne’s operational concept of light signals, where the 
light signals are supposed to travel over vast distances which separate stars 
or even galaxies ([1943], pp. 40-1). In fact, Milne’s ‘realistic idea’ is an a 
priori assumption, very much akin to a thought experiment which cannot 
be dismissed on the grounds of its unreality. 

Now, it is particularly the paradigm of the finite universe which reveals 
the pertinent difference between the two cosmologies. A determinate size, 
like a finite one, is in absolute opposition to the arbitrarily large universe 
envisaged by McCrea, where the proportion of non-typical observers 
located near the periphery is declared to be minimal in any case. It was 
primarily his refusal to fall into such a confusion which prompted Einstein’s 
drive towards a new cosmology in the first place. If McCrea’s small number 
of atypical observers is to be completely passed over in order to preserve 
a positive meaning for the equivalence, then the emphasis on what is 
observational as the only possible plank for a scientific cosmology does no 
better justice to Einstein’s original breakthrough than it does to Milne’s 
wrestling with the ghosts of theory. By determinate size we should under- 
stand the notion of a determinate totality, in the sense of an epistemological 
concept the foundation of which is sound quite irrespective of any empirical 
evidence either way. What Einstein originally meant by the whole universe 
was based on the belief in the impossibility of making a comparison of the 
whole with any merely local extension of a given metric.¹ A constant pressure,
as we have seen from Einstein’s own first groping steps, whether
zero or any other value, demands a determinate size for the universe in 
accordance with the original construction of a spatially finite totality. When 
boundary conditions are abandoned in Newtonian theory (for instance by 
recalling that the matter lying outside a large spherical volume has no 
gravitational effect on the matter inside it), neither the definite structure 
nor the definite size of the material universe can be inferred. Einstein was 
certainly quite consistent when, in his 1917 paper, he did refuse to consider 
the equivalence in terms of the elimination of boundary conditions; this 
elimination is endemic to general relativity, since it articulates the idea of 
a universe of determinate size. 


1 Still in 1917, de Sitter discovered the puzzling existence of an empty solution of Einstein’s 
cosmological equations. But he based his solution on the Schwarzschild interior metric 
supplemented by the cosmological constant. This allowed for a kind of substitution of the 
constant for the density of matter in the expression of the radius of curvature. The non- 
statical form of the de Sitter solution dispenses with this substitution. Significantly, the 
original form remains today as the basis of an entirely different type of cosmology, that of 
the steady state (see Weinberg [1972], pp. 385-92 and pp. 459-60). 


9 CONCLUSION


Over and above these considerations, the originality of Milne’s early paper 
remains unassuageable. By injecting into his interpretation of general rela- 
tivity his own philosophical inklings about the physical reality of space, 
Milne has not only paved the way for equivalence, he has also pointed to 
the major conceptual difficulties involved in all relativistic images of an 
expanding universe. It is no doubt true that all manner of thoughts about 
space as an a priori entity tend to erode relativistic cosmology and deprive 
it, if not of all validity at least of any pre-eminence among theories. Now, 
in the case of the dynamic universe, it is clear that the Robertson–Walker
metric implies the geometry of a universe which is fixed in large part quite 
independently of the theory of gravitation; the conspicuous example of the 
Einstein-de Sitter model shows unambiguously enough the amount of 
independence from geometry with respect to other variables, in the sense 
that the curvature is not directly apprehensible. This makes the status of 
relativistic cosmology as a distinctive theory of the universe quite precari- 
ous, as Milne understood well. The resulting equivalence to a sophisticated 
form of Newtonian cosmology demonstrates the role of convention in both 
world-pictures, as far as the fixation of that tricky entity space is concerned. 

But there is more to come. Milne’s alleged equivalence forces relativistic 
cosmology to exhibit another of its basic ambiguities, i.e., time. What Milne
first shows in the Einstein-de Sitter model, and later (with McCrea) in all 
other varieties of relativistic world model, is that the equations describing 
the behaviour of a particle with fixed (co-mobile) coordinates are formally 
identical with the Newtonian equation describing the distance of a particle 
from the origin as a function of time. In the Newtonian case, the time 
indicated by a clock in motion accompanying a given particle is the same 
as indicated by the clock of any distant observer—it being assumed that the 
ordinary definition of simultaneity with the help of light signals is appli- 
cable. In the relativistic case, it is clear that the cosmic time of any event 
does not coincide with the epoch allotted to it by a distant observer using 
the same definition of simultaneity. It is essentially the assumption of 
homogeneity of the universe, as it is developed in relativistic cosmology, 
which in itself suggests the equivalence. Milne defines homogeneity by 
referring to two particles of an homogeneous distribution ([1935], p. 61). 
Let us call them P and Q: the density distribution is homogeneous if, for 
any pair (P,Q), the density in P is the same as the density in Q, that is if 
ρ(P) = ρ(Q). In the case of non-static homogeneous systems, the situation
is far from simple. Suppose an observer O of such a system measures the
density distribution of particles, at an arbitrary point P at the epoch t of
the event E in P. He finds ρ = ρ(P,t). From the point of view of the
experience of O, the system will be homogeneous if the density in P at
epoch t is equal to the density in another point, Q, at the same epoch t of
the event in Q. Thus, for O, ρ(P,t) = ρ(Q,t). Another observer of the
system, let us say O’, in motion with respect to O, will allot to the events
E_P and E_Q in P and Q at time t of O two different epochs t_P and t_Q. Thus,
O and O’ will find equal densities in P and Q, but at different times; a
homogeneous system for O will not be so for O’. Of course, the reason is
that no objective simultaneity exists between two events in two different
places for two separate observers. Relativistic cosmology overcomes the
difficulty by postulating another time, τ, instead of t. This new time denotes
the proper time elapsed at each particle from a common origin. A system
is homogeneous if, for any pair of points P and Q, ρ(P,τ) = ρ(Q,τ) where
ρ is computed by the observers in P and Q at the epochs indicated by
their own clocks at these points. This is justified by Weyl’s Principle. Its
definition is given by Bondi in the following terms: ‘The particles of the
substratum (representing the nebulae) lie in space-time on a bundle of
geodesics diverging from a point in the (finite or infinitely distant) past’
(p. 100). All geodesics of the substratum intersect only once, at the very
moment when they define the zero of time (see Whitrow [1980], pp. 290–1).
This allows all clocks carried by the fundamental particles to be
synchronised. Milne thinks this homogeneity provides quite a conventional 
definition, since it looks as if the previous difficulty has been simply 
reversed. Instead of having O’ measuring equal densities at different times 
in two points P and Q, we have a density which changes from point to 
point, different in P and Q at the same epoch of the experience of O’. In 
Milne’s eyes, this situation suggests, despite the numerous claims that the 
facts of the expanding universe are simply expressed by Weyl’s Principle, 
that the homogeneity assumption can have no counterpart in ‘reality’, i.e., 
that the cosmological principle cannot be a law of nature. Even if the 
absence of objective simultaneity is taken as something to be overcome at
any cost, it is certainly more natural, Milne says, to adopt a straight- 
forwardly Newtonian concept of time and, along with it, the Newtonian 
theory as the sound basis of a cosmology. This is what Milne himself 
endeavoured to do after he had established the equivalence, and while he 
was arduously engaged in the process of formalising kinematic relativity 
(see [1944]). The main idea is that the Newtonian, expanding universe 
implies that acceleration is relative, in the sense that it is not seen as absolute 
but as arbitrary, depending on the galaxy which is taken as reference system. 
In fact, the additional requirements of unboundedness, velocity-distance 
proportionality and c as absolute limit are nothing less than the basis for 
the theory of kinematic relativity, where the notion of equivalent observers 
is substituted holus bolus for homogeneity. 
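
The failure of simultaneity on which the homogeneity argument turns can be made explicit in the simplest case. As a sketch only, assume that O’ moves with constant velocity v along the line joining P and Q, that the two points are separated by a distance Δx in the frame of O, and that special relativity applies locally; v, Δx, c and γ are symbols introduced here for illustration and are not Milne's. Two events that O assigns to the same epoch t then receive, for O’, epochs differing by

\[
t'_Q - t'_P = -\gamma\,\frac{v\,\Delta x}{c^2}, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} ,
\]

which vanishes only if v = 0 or Δx = 0. If the density varies with time, equal densities at equal epochs of O therefore cannot in general be equal densities at equal epochs of O’. Labelling events by the proper time elapsed along each fundamental particle removes the dependence on v, but only by stipulating which clocks are to count.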

On the whole, the equivalence between Newtonian and relativistic cos- 
mology only reinforces the conviction that cosmic time is indeed a necessary 
ingredient in the formalisation of a relativistic cosmology, however alien to 
general relativity and congenial to Newton’s theory the notion of universal 
synchronisation might seem. The common origin postulated by Weyl’s 
Principle is clearly not an unambiguous, ‘operational’ definition of cosmic
time, as the very terms used by Bondi reveal. One has no hint as to the
meaning of finite past or of an infinitely distant past, i.e., whether the
common origin is a datable event (in the astrophysical sense) or some sort 
of chronogeometrical source. The fact that it does not, after all, matter 
where observations are concerned, reflects the conventional part of rela- 
tivistic cosmology. 

Accordingly, when the equivalence is worked out on the basis of local, 
observable identity between the two theories at issue, both Milne’s original 
intention and the original relativistic line of thought tend to be overlooked. 
The history of the problem, as sketched here, shows that the alleged equiva- 
lence and its criticisms stem from great confusion. Einstein’s work is domi- 
nated by Machian conceptions, Layzer’s by a clinging to orthodox relativity, 
Milne’s by kinematic relativity, McCrea’s by rather too strict Newton- 
ianism. The central point is Milne’s conception of a form of simultaneity 
in the examination of the problem of distribution versus motion. This point 
was overlooked by Layzer, yet problems of interpretation circle round 
Layzer’s objections. It is helpful to be reminded that relativistic cosmology, 
too, to quote Herbert Dingle, ‘gives us in the same formula the structure 
of a system and the motions occurring in it’ ([1955], p. 165). The difficulty 
is concomitant with the emergence of an apparent centre from which the 
distribution and the motions occur together. This centre is pictured in
terms of a transcendental common origin according to Weyl’s Principle, 
and McCrea has reintroduced an element of pure ideality in his sketch of 
Newtonian cosmology by way of the immaterial inertial system. At any 
rate, the operational definition of the centre cannot be a problem pertaining 
to Newtonian cosmology. A reversion to that problem, in the manner of 
Schücking and Heckmann, simply expresses the fact that the intriguing, yet
decisive properties of the centre have tended to overshadow the nature of 
a definite periphery. Of course, the most recent and up-to-date ideas about 
the nature of physical science carry with them an all too impressive quantity 
of arguments in favour of the distinction. There is no point in denying that 
the promised unification of all aspects of physics under the guidance of 
a comprehensive model of the very early, centre-like universe plays the 
dominant role in this latent move. In place of the transcendental origin 
postulated by Weyl’s Principle there has been gradually substituted a sat- 
isfactory physical theory which is supposed to contain the key to unification. 

As a result, and by way of mere analogy, the alleged equivalence between
the two cosmologies sometimes seems reminiscent of early attempts to 
challenge the genuine originality of the Copernican system of the world, as 
opposed to the Ptolemaic. The tendency to neutralise Copernicus’ origi- 
nality always stems from an unchanging consensus about observations; its 
superiority could be granted only in offering a heuristic means by which 
Newton’s physics could be established. Just as this tendency brushes over 
the roots of Copernicus’ work, which were a radically different conception 
of the universe, the contemporary observational grounds for the equivalence
between Newtonian and relativistic cosmology overlook both Einstein’s 
radical way out of Newton’s cosmology and Milne’s original model. In the 
case of Milne, the originality of the model cannot be appraised without 
relating it to the originality of his thought in general. For he did not see 
cosmology as the test case for the validity of some particular physical theory, 
but rather he refused to believe that he would impair the scientific status 
of cosmology if he first reduced all forms of possible theories about the 
universe (including his own) to their ground of equivalence, whereby the 
common, insuperable elements of conventionality would be eradicated. The 
reduction was meant to pave the way for an entirely new type of question, 
pertaining to the bases of physical science, such as: ‘Why do the laws of 
dynamics hold good at all?’ ([1948], p. 11). Milne starts ‘with a set of
definitions which lead to theorems closely corresponding to laws empirically 
observed’. The correspondence of the definitions with the real, rather than 
the degree of their quality of being that Henri Poincaré called (in French) 
‘commode’, is the real issue of conventionalism. With this type of question, 
the whole traditional way of mathematising nature ‘debars itself at the 
outset’. By virtue of its all-encompassing object, cosmology should tend to 
provide a very peculiar set of definitions, since they must ‘define abstract 
entities which are the counterpart of entities existing in Nature’. Finally, 
conventionalism leads to its own reversal: ‘We come back to the Platonic 
doctrine of ideas.’ 

Not surprisingly, Milne’s ideas on the equivalence have been little under- 
stood, since they have met with both success and distortion. The reason 
for the distortion is twofold: Implicitly Milne applies concepts borrowed 
from his own theory of the universe, and this tends to conceal the originality 
of both from the eyes of those who investigate different types of implication 
in the newly invented Newtonian cosmology. That the equivalence has 
been successful, after all, is indicative of the fact that the solution to the 
problem of an expanding universe cannot simply be the bare empirical
verification that one of the models predicted by or compatible with general
relativity actually does exist.


University of Sydney, Australia 


REFERENCES 


BONDI, H. [1960]: Cosmology. Cambridge University Press (1st ed. 1952).

BORN, M. [1943]: Experiment and Theory in Physics. Cambridge University Press.

CALLAN, C., DICKE, R. H. and PEEBLES, P. J. E. [1965]: ‘Cosmology and Newtonian Mech- 
anics’, American Journal of Physics, 33, pp. 105-8. 

CHARLIER, C. V. L. [1908]: ‘Wie Eine Unendliche Welt Aufgebaut Sein Kann’, Arkiv för 
Math. Astr. och Fys., 4, no. 24. 

CHARLIER, C. V. L. [1922]: ‘How an Infinite World May Be Built Up’, Arkiv för Math. Astr.
och Fys., 16, no. 22. 

COHEN, I. B. (ed.) [1958]: ‘Newton’s Four Letters to Bentley’, in Newton’s Papers and Letters 
on Natural Philosophy, pp. 280-312. Harvard University Press. 




DE SITTER, W. [1916]: ‘On the Relativity of Rotation in Einstein’s Theory’, Proc. Kon. Akad. 
Wetensch. Amsterdam, 19, pp. 527-32. 

DE SITTER, W. [1921]: ‘On the Possibility of Statistical Equilibrium for the Universe’, Proc.
Kon. Akad. Wetensch. Amsterdam, 23, pp. 866-8. 

DE SITTER, W. [1932]: Kosmos. Harvard University Press. 

DINGLE, H. [1955]: ‘Philosophical Concepts of Cosmology’, Vistas in Astronomy, 1, 
pp. 162-6. 

EDDINGTON, A. S. [1914]: Stellar Movements and the Structure of the Universe. London: 
Macmillan. 

EDDINGTON, A. S. [1923]: The Mathematical Theory of Relativity. Cambridge University
Press.

EINSTEIN, A. [1917]: ‘Cosmological Considerations on the Theory of General Relativity’, 
transl. Perrett and Jeffreys in The Principle of Relativity, pp. 177-88. Dover, 1952. 

EINSTEIN, A. [1919]: ‘Spielen Gravitationsfelder im Aufbau der Materiellen Elementarteilchen
Eine Wesentliche Rolle?’ Sitz. Ber. Preuss. Ak. Wiss. Berlin, p. 354.

EINSTEIN, A. [1933]: ‘Sur la Structure Cosmologique de l’Espace’, French transl. M. Solovine.
Paris: Hermann. 

EINSTEIN, A. [1957]: Relativity. A Popular Exposition (1st ed. 1916), transl. R. W. Lawson.
Methuen. 

EINSTEIN, A. and DE SITTER, W. [1932]: ‘On the Relation Between the Expansion and the 
Mean Density of the Universe’, Proc. Nat. Acad. Sci., 18, pp. 213-14. 

FRIEDMANN, A. [1922]: ‘Über die Krümmung des Raumes’, Zeitschrift für Physik, 10,
pp. 377-87.

HARRISON, E. R. [1981]: Cosmology, The Science of the Universe. Cambridge University Press. 

HECKMANN, O. and SCHÜCKING, E. [1959]: ‘Newtonsche und Einsteinsche Kosmologie’,
Encyclopedia of Physics, S. Flügge (ed.), vol. LIII, pp. 489-519. Springer-Verlag.

HETHERINGTON, N. S. [1973]: ‘The Delayed Response to Suggestions of an Expanding
Universe’, Journal of the British Astronomical Association, 84, pp. 22-8. 

HUBBLE, E. [1929]: ‘A Relation Between Distance and Radial Velocity Among Extra-Galactic 
Nebulae’, Proc. Nat. Acad. Sci., 15, pp. 168-73. 

KERSZBERG, P. [1986]: ‘The Cosmological Question in Newton’s Science’, Osiris, 2nd series,
2, pp. 69-106. 

LAYZER, D. [1954]: ‘On the Significance of Newtonian Cosmology’, Astronomical Journal, 59,
pp. 268-70.

LAYZER, D. [1967]: ‘A Unified Approach to Cosmology’, in Lectures in Applied Mathematics. 
J. Ehlers (ed.), vol. 8, pp. 237-58. Providence: American Mathematical Society. 

MAXWELL, J. C. [1952]: Matter and Motion. Dover (1st ed. 1887). 

McCREA, W. H. [1955]: ‘On the Significance of Newtonian Cosmology’, Astronomical Journal,
60, pp. 271-4.

McVITTIE, G. C. [1954]: ‘Relativistic and Newtonian Cosmology’, Astronomical Journal, 59,
pp. 173-80. 

MILNE, E. A. [1933]: ‘World-Structure and the Expansion of the Universe’, Zeitschrift für
Astrophysik, 6, pp. 1-90.

MILNE, E. A. [1934]: ‘A Newtonian Expanding Universe’, Quarterly Journal of Mathematics,
5, pp. 64-72.

MILNE, E. A. [1935]: Relativity, Gravitation and World-Structure. Clarendon Press.

MILNE, E. A. [1944]: ‘On the Nature of Universal Gravitation’, Monthly Notices of the Royal 
Astronomical Society, 104, pp. 120-35. 

MILNE, E. A. [1948]: Kinematic Relativity. Clarendon Press. 

MILNE, E. A. and McCREA, W. H. [1934]: ‘Newtonian Universes and the Curvature of
Space’, Quarterly Journal of Mathematics, 5, pp. 73-80.

NEUMANN, C. [1896]: Allgemeine Untersuchungen Über das Newtonsche Prinzip der Fernwir-
kungen. Leipzig: Teubner. 

NEWTON, I. [1934]: Mathematical Principles of Natural Philosophy. Transl. A. Motte, rev. F. 
Cajori. University of California Press. 

NORTH, J. [1965]: The Measure of the Universe. Clarendon Press. 

PAULI, W. [1958]: The Theory of Relativity. Pergamon.

RINDLER, W. [1977]: Essential Relativity. Special, General and Cosmological. Rev. 2nd ed. 
Springer-Verlag. 

ROBERTSON, H. P. [1933]: ‘Relativistic Cosmology’, Rev. Mod. Physics, 5, pp. 62-90.




ROBERTSON, H. P. [1935] and [1936]: ‘Kinematics and World-Structure’, Astrophysical Jour- 
nal, 82, pp. 284-301; 83, pp. 187-201; 83, pp. 257-71. 

RUSSELL, B. [1937]: The Principles of Mathematics. 2nd ed. George Allen and Unwin. 

SCHÜCKING, E. [1967a]: ‘Newtonian Cosmology’, Texas Quarterly, 10, pp. 270-4.

SCHÜCKING, E. [1967b]: ‘Cosmology’, in Lectures in Applied Mathematics. J. Ehlers (ed.), vol.
8, pp. 218-36. Providence: American Mathematical Society. 

SCIAMA, D. [1973]: Modern Cosmology. Cambridge University Press.

SEELIGER, H. von [1895]: ‘Über das Newton’sche Gravitationsgesetz’, Astr. Nachr., 137,
pp. 129-33.

SELETY, F. [1922]: ‘Beiträge Zum Kosmologischen Problem’, Annalen der Physik, 68,
pp. 281-334.

SKLAR, L. [1976]: ‘Inertia, Gravitation and Metaphysics’, Philosophy of Science, 43, pp. 1-23.

SMITH, R. W. [1982]: The Expanding Universe. Astronomy’s Great Debate 1900-1931. Cam-
bridge University Press. 

SYNGE, J. L. [1937]: ‘On the Concept of Gravitational Force and Gauss’ Theorem in General 
Relativity’, Proc. Edinb. Math. Soc., 2nd Series, 5, pp. 93-102. 

TORRETTI, R. [1983]: Relativity and Geometry. Pergamon. 

WEINBERG, S. [1972]: Gravitation and Cosmology. John Wiley. 

WESTFALL, R. S. [1973]: Science and Religion in Seventeenth Century England. Michigan 
University Press. 

WEYL, H. [1921]: Space, Time, Matter. Transl. H. Brose on the fourth German edition (1st 
ed. 1918), Methuen. 

WEYL, H. [1923]: ‘Zur Allgemeinen Relativitätstheorie’, Physikalische Zeitschrift, 24,
pp. 230-2. 

WHITROW, G. J. [1980]: The Natural Philosophy of Time. Clarendon Press. 


Brit. J. Phil. Sci. 38 (1987), 381-387 Printed in Great Britain 381 


Discussions 


A NOTE ON THE EVOLUTIONARY THEORY OF SOFTWARE 
DEVELOPMENT 


The invention and improvement of computational software is a process 
that has come to play an increasingly important part in the ferment of 
contemporary culture. It is now a conspicuous factor in the progress of 
industry, commerce, science and scholarship. Accordingly, if it exhibits 
any pervasive patterns or regularities, we should do well to recognise them. 
Not only may our economic decisions and planning strategy profit therefrom. 
But also, to the extent that the explanatory theories of contemporary psy- 
chology exploit analogies with computational processing, any pervasive 
features of software development for implementation in artefacts may be 
indicative of analogous features in human intellectual history. So, to the 
insight afforded by computational models in the synchronic study of the 
contemporary human mind, it may be possible to add a further insight— 
an insight into the diachronic dynamics of intellectual progress. More 
specifically, if the evolution of computational software follows a dis- 
tinctively Darwinian pattern, as has been alleged in various ways by Simon 
[1970], Weinberg [1985], and Lehman and Belady [1985], some con- 
firmation may be derivable for the otherwise rather ill-supported thesis, 
favoured by Popper [1972] among others, that the progress of human 
knowledge is accomplished by a process of Darwinian evolution in the 
world of ideas. On the contrary, however, it turns out that the kinds of 
argument in Cohen [1985] which suffice to destroy Popper’s thesis are also 
effective against the Darwinian account of software development. 

Lehman [1985a] has discerned several different scales on which the pres- 
sure of environmental events may affect the evolution of computational 
programs. The first scale of change operates over different generations of 
hardware. The second is said to operate through the release of new programs 
or of alterations to existing ones. In the early operational life of a system 
release-intervals of the order of a single month are said to be common, 
while as the system ages, complexity increases and greater effort has to be 
put into controlling complexity, so that release interval may eventually 
stretch into two or three years. Other scales of change operate over yet 
narrower intervals, affecting individual steps of design and validation. So 
the fact that programming processes ‘yield development via series of 
changed steps’ satisfies standard dictionary definitions of ‘evolution’. But 
at the same time users’ demands, economic costs, or changes in the nature
of the data create environmental pressures that select between the mutations 
that are fed into the situation by the chance perceptions and varying ingen- 
uity of individual programmers. Thus programs are claimed by Lehman 
and Parr [1985] to adapt to their environment and the fittest are said to 
survive: ‘In its most general terms, the unending development of programs 
is exactly analogous to the evolutionary process that governs the life cycle 
of any complex system.’ Correspondingly one should not think of the 
successful operation of a large software system ‘as a straightforward 
implementation of some preconceived design’, any more than Darwinians 
so think of some well-adapted biological system. Rather, we are told, ‘It is 
of necessity a process that continuously modifies and improves parts of the 
code to reflect changes in the environment, shifts in objectives and advances 
in technology.’ And in this respect there is an obvious difference between 
the maintenance problems presented by hardware and software, respec- 
tively. As Belady [1985] puts it, ‘While hardware physically deteriorates due 
to wear, corrosion or fatigue, and repair is then defined as the replacement of 
the component in order to bring the system back to its original state, the 
elimination of software malfunction is performed by changing away from 
designed or constructed state.’ Moreover we can certainly notice one respect 
in which a Darwinian analysis relates to the process of software development 
rather like the way in which it relates to the progress of theoretical science. 
In the case of software development the closest analogue of a Darwinian 
species is presumably a program, while the analogue of a Darwinian popu- 
lation is constituted by the totality of the program’s implementations. In 
the case of a theoretical science the analogue of a species is most plausibly 
regarded as a theory-type, while the analogue of a population is constituted 
by the totality of its theory-tokens that are actively in use (e.g., through the 
consultation of the individual volumes in which these printings occur). 

But when we look a little closer these analogies soon begin to break down. 
The trouble is not that software changes (or new scientific theories) are not 
the result of genetic mutations or manifested in physiological adaptations. 
How could they have such specifically biological features? The trouble is 
rather that even at the level of abstraction at which the parallelism should 
be maintained it nevertheless ceases to exist. At least four such disanalogies 
deserve to be noticed. 

First, Darwinian speciation occurs via a large number of mutually inde- 
pendent events in which natural selection operates on individual members 
of a population. But in software development it may well be one particular 
implementation that initiates a change and thus helps to trigger off identical 
implementations in a wide range of separate locations: i.e., typically the
change has a central source, from which it is disseminated to wherever local 
conditions are favourable (and the same is true in scientific progress). 

Secondly, in Darwinian speciation the processes that bring about 
mutation are decoupled from those that bring about natural selection. But 
in the computational world the economic pressures, changes in data, etc.,
that favour the implementation of selected new programs or of selectively 
modified old ones, are also the factors that cause research to be done on 
new programs or on the modification of old ones. So, as with attempts to 
fit the progress of science to the Darwinian model, a core feature of Darwin’s 
analysis is missing. Darwin tried to show how species could evolve without 
any operation of rational design or purposive activity, and for this task 
the decoupling of mutation from selection was essential. But in software 
development, as in the progress of science, intelligence and rationality are 
at a premium.

Thirdly, the process of software development lacks any close analogue 
to the probabilistic operation of a gene-pool. The co-implementation of 
two combinable programs (or the conjunction of two mutually consistent 
theories) normally produces the same fully determinate offspring in each 
case. Of course, there may well be surprises about the behaviour of this 
offspring. But if so the surprising behaviour will not normally vary between 
one implementation and another. 

Fourthly, such co-implementation permits a great deal of give and take 
between programs. Components found to perform useful functions in one 
program can often also be usefully incorporated into another, just as scien- 
tists in one field often exploit experimental techniques or mathematical 
procedures that have been developed in another. But cross-lineage borrowing
is not an evolutionary mechanism in the biological world.

So though the process of software development may undeniably be 
described as ‘evolutionary’ in some loose or colloquial sense of that term, 
it cannot avoid having at least four features that exclude its evolutionary 
pattern from being accurately described as Darwinian. There may well be 
certain laws, regularities or statistical invariances discernible in the process, 
as Lehman [1985b] has claimed. But the analogy with biological evolution 
is too slight to deserve any attention in our attempts to understand the 
process, and correspondingly the nature of software development provides 
no confirmation for Darwinian analyses of human intellectual history. 

One can see what motivates the evolutionary analysis. The process of 
software improvement (and the progress of science) has to be accounted for
without any postulation of a single, coherent, overall plan. Accordingly 
people are tempted to borrow the model that successfully accounts for 
biological speciation without invoking any grand rational design. Just as 
Darwin avoided the need to cite God’s plan for Nature, so too software 
development does not proceed as if there is a single Software Engineer 
who masterminds the whole process. Hence there is an understandable 
temptation to suppose that software improvements (and scientific ones) 
come about in basically the same way as biological ones. But this must be
an incorrect explanation, because in the processes of software development 
and scientific progress very many different rational purposes and very many 
accidental factors combine or conflict so as to bring about what actually 
happens, while in Darwinian evolution no rational purposes have any significant
role to play. Look at it from the other direction. If biological
speciation were really analogous in crucial respects to software improve- 
ment, it would come about in part because of the mutually interacting 
rational activities of numerous demigods rather than via natural selection. 


L. JONATHAN COHEN 
The Queen’s College, Oxford University 


REFERENCES 


BELADY, L. A. [1985]: ‘Staffing Problems in Large Scale Programming’, in M. M. Lehman 
and L. A. Belady (eds.), Program Evolution: Processes of Software Change, pp. 275-88. 
London: Academic Press (first published 1978). 

COHEN, L. J. [1985]: ‘Third World Epistemology’, in G. Currie and A. Musgrave (eds.),
Popper and the Human Sciences, pp. 1-12. Dordrecht: Nijhoff.

LEHMAN, M. M. [1985a]: ‘Program Evolution’, in M. M. Lehman and L. A. Belady (eds.), 
Program Evolution: Processes of Software Change, pp. 9-38. London: Academic Press 
(first published 1984). 

LEHMAN, M. M. [1985b]: ‘Laws of Program Evolution: Rules and Tools for Programming
Management’, in M. M. Lehman and L. A. Belady (eds.), Program Evolution: Processes of
Software Change, pp. 247-74. London: Academic Press (first published 1978). 

LEHMAN, M. M. and BELADY, L. A. [1985]: ‘Introductory Review’, in M. M. Lehman and 
L. A. Belady (eds.), Program Evolution: Processes of Software Change, pp. 1-8. London: 
Academic Press. 

LEHMAN, M. M. and PARR, F. N. [1985]: ‘Program Evolution and its Impact on Software
Engineering’, in M. M. Lehman and L. A. Belady (eds.), Program Evolution: Processes of
Software Change, pp. 201-20. London: Academic Press (first published 1976).

POPPER, K. R. [1972]: Objective Knowledge: An Evolutionary Approach. Oxford: Clarendon
Press. 

SIMON, H. A. [1970]: The Science of the Artificial. Cambridge: MIT Press.

WEINBERG, G. M. [1985]: ‘Natural Selection as Applied to Computers and Programs’, in 
M. M. Lehman and L. A. Belady (eds.), Program Evolution: Processes of Software 
Change, pp. 85-98. London: Academic Press (first published in 1970). 


DO CONJUNCTIVE FORKS ALWAYS POINT TO A COMMON CAUSE? 


Reichenbach ([1956], p. 159) defines a conjunctive fork as an ordered triple
of events ⟨A,B,C⟩ satisfying the following statistical relations:


P(A/C) > P(A/-C)
P(B/C) > P(B/-C)                                   (1)
P(A&B/C) = P(A/C)P(B/C)
P(A&B/-C) = P(A/-C)P(B/-C)


(where P(X/Y) denotes the conditional probability of X given Y, and -Z
stands for the non-occurrence of Z). By a short calculation, based on the
fact that P(X) = P(X/Y)P(Y) + P(X/-Y)P(-Y), Reichenbach ([1956],
p. 160; cf. Salmon [1984], p. 160 n. 2) shows that (1) entails that


P(A&B) > P(A)P(B) (2) 


i.e., that A and B are not statistically independent. It can be readily seen 
that relations (1) remain invariant under the mutual exchange of A and B. 
Conjunctive forks are thus symmetric in their first two terms. On the other 
hand, relations (1) are not symmetric in A and C, nor in B and C. Reichenbach
believed that this asymmetry of conjunctive forks has a causal significance.
Three events, A, B and C, will constitute a conjunctive fork as
above if C is a common cause of A and B. It may indeed occur that a triple
⟨A,B,E⟩ is a conjunctive fork, where E is a common effect of A and B. But
such can be the case ‘only on the condition that there also exists a common 
cause C satisfying the same relations (1) with A and B’ (Reichenbach [1956], 
p. 162). 
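
The short calculation by which (1) entails (2) can be sketched as follows. The abbreviations are introduced here for convenience and are not Reichenbach's: c = P(C), a = P(A/C), a' = P(A/-C), b = P(B/C), b' = P(B/-C). The sketch assumes 0 < c < 1, so that all four conditional probabilities are defined.

```latex
\begin{align*}
P(A) &= c\,a + (1-c)\,a', \qquad P(B) = c\,b + (1-c)\,b', \\
P(A \& B) &= c\,ab + (1-c)\,a'b'
  \quad \text{(by the last two relations of (1))}, \\
P(A \& B) - P(A)P(B)
  &= c\,ab + (1-c)\,a'b'
     - \bigl[c\,a + (1-c)\,a'\bigr]\bigl[c\,b + (1-c)\,b'\bigr] \\
  &= c(1-c)\,(a - a')(b - b') \;>\; 0 .
\end{align*}
```

The final inequality holds because the first two relations of (1) give a > a' and b > b', while c(1-c) is positive under the stated assumption.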

Thus Reichenbach claims that the symmetric terms of a conjunctive fork 
must always have a common cause, which may or may not be the third term 
of the fork. An interest in this claim has been rekindled by Wesley Salmon, 
in whose book, Scientific Explanation and the Causal Structure of the World 
[1984], the purported causal significance of conjunctive forks lends some
support to the broader claim that statistical relations can provide, if not an 
analysis of causal connections, at any rate sufficient evidence of their exis- 
tence. According to Salmon ([1984], p. 262), ‘we have strong reason to 
believe that conjunctive forks have considerable explanatory force with 
respect to the order that exists in the universe’.¹

Salmon ([1984], p. 168) quotes an example, due to Ellis Crasnow, of a 
conjunctive fork ⟨A,B,C⟩ such that C is neither a common cause nor a
common effect of A and B. In that example, however, there is a fourth 
event D which is the common cause of A, B and C. Salmon ([1980], p. 59) 
had presented Crasnow’s case as a counter-example to Reichenbach’s thesis 
about the causal significance of conjunctive forks, but he now acknowledges 
that, ‘as Paul Humphreys kindly pointed out in a private communication, 
this was an error’ ([1984], p. 167 n. 8). I shall here propose another example, 
which I think is less tame than Crasnow’s. 

1 Salmon adds at once that, as he has ‘explained in chapter 6, it appears that conjunctive forks
cannot be characterized adequately in terms of statistical relations alone’. One might think
therefore that the conjunctive forks which are said to possess considerable explanatory force
are not those so neatly characterized by Reichenbach. But I have been unable to find in
chapter 6 of Salmon’s book, or anywhere else for that matter, a different definition of
conjunctive forks, or an open rejection of Reichenbach’s.

Consider a lottery for which 160 tickets have been issued. Each ticket
bears a different three-digit number d₁d₂d₃, such that 1 ≤ d₁ ≤ 8, 1 ≤ d₂ ≤ 4,
and 1 ≤ d₃ ≤ 5. There is a single prize. The winning number is determined
by drawing one ball at random from each of the three urns, U₁, U₂, and
U₃, in that order. Uᵢ contains a sizable number of balls of equal size bearing,
in equal numbers, the several digits allowed for dᵢ (1 ≤ i ≤ 3). Let nᵢ denote
the digit on the ball drawn from urn Uᵢ. nᵢ is then the ith digit of the winning
number.

Eutychius Loveluck owns 16 tickets numbered 112, 122, 132, 142, 212,
222, 232, 242, 312, 322, 332, 342, 412, 422, 432 and 442. Let A be the event
that n₁ = 1 and n₃ ≠ 1. Let B be the event that n₂ = 3 and n₃ is even. Let C
be the event that Mr Loveluck wins the prize.¹ It does not take long to see
that the following relations hold between A, B and C: 


P(A/C) = 1/4 > 1/12 = P(A/-C)
P(B/C) = 1/4 > 1/12 = P(B/-C)                      (3)
P(A&B/C) = 1/16 = P(A/C)P(B/C)
P(A&B/-C) = 1/144 = P(A/-C)P(B/-C)


(Note that A&B&C obtains only if the winning number is 132 and that 
A&B&—C obtains only if the winning number is 134.) 
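
Relations (3) can be checked by brute-force enumeration of the 160 equiprobable winning numbers. The Python sketch below is only an illustration and not part of the original argument. The helper names (`prob`, `A`, `B`, `C`, `not_C`, `A_and_B`) are made up here, and A is read as "n1 = 1 and n3 is not 1", B as "n2 = 3 and n3 is even", which is the reading under which the stated values come out. Exact fractions are used so that the equalities are tested without rounding.

```python
from fractions import Fraction
from itertools import product

# All 8 * 4 * 5 = 160 equiprobable winning numbers (n1, n2, n3).
OUTCOMES = list(product(range(1, 9), range(1, 5), range(1, 6)))

# Mr Loveluck's 16 tickets: 112, 122, ..., 442 (third digit always 2).
TICKETS = {(d1, d2, 2) for d1 in range(1, 5) for d2 in range(1, 5)}


def A(n):
    """n1 = 1 and n3 != 1 (assumed reading of event A)."""
    return n[0] == 1 and n[2] != 1


def B(n):
    """n2 = 3 and n3 is even (assumed reading of event B)."""
    return n[1] == 3 and n[2] % 2 == 0


def C(n):
    """Mr Loveluck holds the winning ticket."""
    return n in TICKETS


def not_C(n):
    return not C(n)


def A_and_B(n):
    return A(n) and B(n)


def prob(event, given=lambda n: True):
    """P(event / given) under the uniform distribution on OUTCOMES."""
    pool = [n for n in OUTCOMES if given(n)]
    return Fraction(sum(1 for n in pool if event(n)), len(pool))


if __name__ == "__main__":
    print("P(A/C)    =", prob(A, C), "   P(A/-C)    =", prob(A, not_C))
    print("P(B/C)    =", prob(B, C), "   P(B/-C)    =", prob(B, not_C))
    print("P(A&B/C)  =", prob(A_and_B, C))
    print("P(A&B/-C) =", prob(A_and_B, not_C))
    print("P(A&B)    =", prob(A_and_B), "  P(A)P(B) =", prob(A) * prob(B))
```

The same enumeration gives P(A&B) = 1/80 against P(A)P(B) = 1/100, so A and B are indeed statistically dependent, as (2) requires.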

The triple ⟨A,B,C⟩ thus constitutes a conjunctive fork. I do not
know whether C can be properly described as a common effect of A and B, 
but it certainly is not their common cause. Under the circumstances, indeed, 
if events A and B come to pass, they will have one or more common causes, 
such as the order given by the lottery supervisor to draw the balls from 
each urn, or the decision to institute the lottery, or, presumably, the Big 
Bang. But if the lottery is fair, none of those common causes can be the 
third term of a conjunctive fork whose first two terms are A and B. 

To see this more clearly let us modernise the procedure by which the 
winning number is determined. Let mᵢ be the highest digit that a ticket can
have on the ith position of its three-digit number. (Hence, m₁ = 8, m₂ = 4
and m₃ = 5.) The winning number will be established by means of a Geiger
counter placed successively, for a fixed period of time, in the neighbourhood
of three separate chunks of some suitable radioactive material. Let gᵢ be the
number recorded by the counter at the end of its stay near the ith chunk,
and let nᵢ be, as before, the ith digit of the winning number. nᵢ is given
unambiguously by the twofold condition:


1 ≤ nᵢ ≤ mᵢ and nᵢ ≡ gᵢ (mod mᵢ).                  (4)
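
Condition (4) picks out exactly one digit for each counter reading. A minimal Python sketch of that mapping follows. The function names and the sample readings are hypothetical and purely illustrative; only the three maxima 8, 4 and 5 come from the text.

```python
MAX_DIGITS = (8, 4, 5)  # m1, m2, m3 from the text


def winning_digit(g, m):
    """Return the unique n with 1 <= n <= m and n congruent to g modulo m."""
    return (g - 1) % m + 1


def winning_number(counts, max_digits=MAX_DIGITS):
    """Map three Geiger-counter readings to the three digits of the winning number."""
    return tuple(winning_digit(g, m) for g, m in zip(counts, max_digits))


if __name__ == "__main__":
    # Hypothetical counter readings, not taken from the text.
    print(winning_number((137, 52, 90)))  # prints (1, 4, 5)
```

Shifting by one before and after the remainder is what moves the usual residues 0, ..., m-1 onto the digits 1, ..., m, so a reading divisible by m yields the digit m rather than 0.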


1 As is usual with such examples, I assume that it is certain that the lottery will take place,
that the players keep their tickets until the outcome is known, and that the rules of the game
and the property laws will not be changed before the lottery is over. Readers who are
immoderately fond of precision may incorporate these assumptions into the description of
C. Or they can make allowance for the possibility that Mr Loveluck dies or resells some of
his tickets, by redescribing C as ‘the event that Mr Loveluck or one or more of his heirs or
successors wins the lottery’. They must then, indeed, prescribe that none of the said heirs
or successors can obtain any tickets except from Mr Loveluck. (As it often happens, to make
the story philosopher-proof one may have to end by making it unreadable.)

Then according to the current understanding of radioactivity, if A and B
come to pass, they will not depend on or form a conjunctive fork with any
earlier event.¹

No matter how we arrange the lottery, the statistical interdependence of 
A and B, entailed by (3), is not due to their issuing from a common cause, 
but merely to the fact that one of the features of A is a logical consequence 
of one of the features of B (although A, of course, is not entailed by B). 
The example could be done away with by restricting conjunctive forks to 
atomic events. I wonder whether this can give comfort to someone reaching 
for a realist understanding of probabilistic causality. After all, nobody has 
yet been able to show that the class of atomic events is not empty in this 
intricate world of ours. 

The example I have presented is notoriously artificial. A slight change 
in the tickets held by Mr Loveluck will suffice to destroy relations (3) and 
with them the conjunctive fork ⟨A,B,C⟩. For a different choice of tickets
held by Mr Loveluck one might still be able to produce a conjunctive fork 
by giving a different, generally much more far-fetched description of events 
A and B. Changing the total number of tickets, e.g., to 159 or 161, can have 
the effect of making insoluble the problem of constructing a conjunctive 
fork along the above lines. The fact that a conjunctive fork devoid of 
causal significance can be thus contrived with only the barest minimum of 
arithmetical ingenuity indicates, to my mind, that the statistical concept of 
a conjunctive fork that Reichenbach defined by means of relations (1) does 
not possess the philosophical importance that he assigned to it. Whether 
an homonymous non-statistical concept will have any significance depends, 
of course, on how its sponsors propose to define it.²
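
How fragile the construction is can be seen by re-running the enumeration with a slightly different set of tickets. In the Python sketch below, the function `is_conjunctive_fork` and the particular swap (ticket 132 exchanged for 133) are illustrative assumptions, not taken from the text. A and B are read as "n1 = 1 and n3 is not 1" and "n2 = 3 and n3 is even", and C is the event that one of the given tickets wins.

```python
from fractions import Fraction
from itertools import product

# All 160 equiprobable winning numbers (n1, n2, n3).
OUTCOMES = list(product(range(1, 9), range(1, 5), range(1, 6)))

# Mr Loveluck's original holding: 112, 122, ..., 442.
ORIGINAL = frozenset((d1, d2, 2) for d1 in range(1, 5) for d2 in range(1, 5))


def A(n):
    return n[0] == 1 and n[2] != 1      # assumed reading of event A


def B(n):
    return n[1] == 3 and n[2] % 2 == 0  # assumed reading of event B


def both(n):
    return A(n) and B(n)


def cond_prob(event, pool):
    """Probability of `event` among the equiprobable outcomes in `pool`."""
    return Fraction(sum(1 for n in pool if event(n)), len(pool))


def is_conjunctive_fork(tickets):
    """Check the four relations (1) for A, B and C = 'a ticket in `tickets` wins'."""
    won = [n for n in OUTCOMES if n in tickets]
    lost = [n for n in OUTCOMES if n not in tickets]
    return (
        cond_prob(A, won) > cond_prob(A, lost)
        and cond_prob(B, won) > cond_prob(B, lost)
        and cond_prob(both, won) == cond_prob(A, won) * cond_prob(B, won)
        and cond_prob(both, lost) == cond_prob(A, lost) * cond_prob(B, lost)
    )


if __name__ == "__main__":
    swapped = (ORIGINAL - {(1, 3, 2)}) | {(1, 3, 3)}  # hypothetical one-ticket change
    print(is_conjunctive_fork(ORIGINAL))  # prints True
    print(is_conjunctive_fork(swapped))   # prints False
```

With ticket 133 in place of 132 no winning ticket satisfies both A and B, so P(A&B/C) drops to 0 while P(A/C)P(B/C) stays positive, and the third relation fails.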


ROBERTO TORRETTI 
Universidad de Puerto Rico 


REFERENCES 


REICHENBACH, H. [1956]: The Direction of Time. Berkeley: University of California Press. 
SALMON, W. C. [1980]: ‘Probabilistic Causality’, Pacific Philosophical Quarterly, 61, pp. 50-74.
SALMON, W. C. [1984]: Scientific Explanation and the Causal Structure of the World. Princeton: 
Princeton University Press. 


1 That A and B do not depend on previous events must be understood to mean that for
any such event K to which a probability greater than 0 can be meaningfully assigned,
P(A/K) = P(A), P(B/K) = P(B) and P(A&B/K) = P(A&B). Since (3) implies that
P(A&B) > P(A)P(B), the triple ⟨A,B,K⟩ cannot meet the requirements of a conjunctive
fork.

2 I am very grateful to Paul Humphreys for his illuminating comments on the first draft of 
this note. 


Brit. J. Phil. Sci. 38 (1987), 389-418 Printed in Great Britain 389 


Reviews 


KITCHER, PHILIP [1985]: Vaulting Ambition: Sociobiology and the Quest
for Human Nature. MIT Press. xi + 456 pp. Hardback (ISBN 0-262-11109-8.)

FETZER, J. H. (ed.) [1985]: Sociobiology and Epistemology. D. Reidel, Dordrecht.
x + 283 pp. Hardback (ISBN 90-277-2005-3), paperback
(ISBN 90-277-2006-1.)


During the last ten years, the infant discipline of sociobiology has made 
large claims and encountered mighty opposition. Philip Kitcher opti- 
mistically hopes that he can put an end to these controversies. To do so, he 
has written a book that is outstanding in its rigorous argument and searching 
analysis of the grander claims of human sociobiology. He brings to his task 
an impressive combination of philosophical, mathematical and biological 
expertise. It is a book that anyone who wishes to use the insights of sociobi- 
ology in the study of humans should take seriously. Unfortunately, despite 
its rigour, it never takes human sociobiology seriously. Kitcher allows that 
one day a ‘serious’ human sociobiology might be produced, but believes it 
is simply guesswork to wonder which human institutions evolution might 
be expected to favour. He considers that the general picture of gene-culture 
co-evolution has not even begun to be properly elaborated. As a result, he
is scathing in his attacks on what he terms ‘pop sociobiology’, as exemplified 
by the work of E. O. Wilson. The very term suggests considerable contempt, 
and all too often the force of Kitcher’s arguments is lessened by the sneering 
tone in which they are conducted. A typical comment by him comes when 
he deals with the sociobiological treatment of philosophical questions. He 
says: ‘So the pop sociobiologists go to work. Give them a wet Sunday 
afternoon and they will unriddle humanity’ (p. 396). 

Part of the trouble is that Kitcher is prejudiced against attempts to 
ground human institutions in human biology. He insists that the grave 
consequences of error in this field enforce the need for higher standards of 
evidence. He says of pop sociobiology: ‘The mistakes merely threaten to 
stifle the aspirations of millions’ (p. 435). He even manages (p. 11) to link 
the selection of children at the age of eleven for English grammar schools 
with the evils of pop sociobiology. His sustained and vitriolic attack on 
human sociobiology seems to result in the condemnation of the whole 
enterprise. Yet what he is usually doing is merely to show the promissory 
nature of much human sociobiology. It bases large claims on insubstantial 
evidence, and is a research programme that has been barely sketched out,
let alone implemented. Its weakness for spectacular and speculative gen- 
eralisation is deservedly exposed by Kitcher, but as he sometimes admits (p. 
31), ‘We might some day achieve justified conclusions about the evolution of 
some aspects of human behaviour.’ 

The strength of the book undoubtedly lies in the close examination of 
the various steps in the arguments of sociobiologists. For instance, his dis- 
cussion of the possible strategies open to each sex shows clearly how E. O. 
Wilson simplifies an extremely complex subject. It is doubtful whether 
there can be one optimal strategy for males and one for females, particularly 
when creatures can assess their own positions and the strategies of those 
around them. Similarly, the concept of dominance has to be refined much 
more than the casual references to it by sociobiologists allow. Kitcher also 
makes some trenchant comments about the simple view that all organisms 
are optimally designed in all respects. He holds (p. 226) that ‘evolution is 
not the best of all possible architects’. 

One favourite example of human sociobiologists is incest. It is claimed that
human incest avoidance is genetically based. Kitcher however argues (p. 
205) that incest avoidance could be explicable without invoking genetic 
differences, as long as children tend to rear their own children in environ- 
ments similar to those in which they themselves were reared. The mere fact 
that those avoiding incest will leave more offspring will mean that the 
environment promoting incest avoidance will become more prevalent. Kit- 
cher also effectively points out (p. 348) that sociobiologists make little 
distinction between incest avoidance, actual disapproval of incest and the 
presence at a cultural level of an incest taboo. Despite the efforts by C. J. 
Lumsden and E. O. Wilson to allow for the influence of mind and culture, 
it becomes apparent that they have not really succeeded. Kitcher sums up
his criticism of their book, Genes, Mind and Culture, by saying (p. 394): ‘So
indeed there are no genes, no mind, no culture. But there are lots of 
equations.’ 

Another example of where Kitcher’s rigorous style of attack pays divi- 
dends is in his analysis of Lumsden and Wilson’s ‘thousand-year rule’, 
according to which culture can have genetic effects within a thousand years 
in the course of ‘gene-mind co-evolution’. He shows there are two crucial 
assumptions, that people are not supposed greatly to modify their pro- 
pensity for using elements of culture in the light of what they see around 
them, and that the differences in success of different ‘culturgens’ have to 
be rather large. He says (p. 390): ‘The thousand-year rule is a theorem of
the theory of gene-culture co-evolution, when it is applied to the evolution 
of hypothetical people of extraordinary stupidity.’ 

Kitcher has the expertise to show the irrelevance of much of the mathematical
apparatus that Lumsden and Wilson delight in. His last chapter,
on the question of the relation between ethics and biology, is less satisfactory.
He attacks the ambitions held by Wilson of ‘biologicizing ethics’, and points 
out some of his confusions. For instance, Wilson manages to defend an
emotivist meta-ethics, while claiming that increased biological knowledge 
will put us in a ‘better position’ to choose ethical precepts. In an excessively 
swift survey of moral philosophy, Kitcher espouses a Humean account of 
freedom. Issues concerning altruism, determinism, moral objectivity, and 
justice are quickly dealt with. The place of reason in morality is never 
properly discussed. In this chapter, Kitcher has definitely tried to take on 
too much at once. 

Despite Kitcher’s doubts, many believe that sociobiology can make a 
significant contribution to our philosophical understanding. The collection 
of papers called Sociobiology and Epistemology examines the wider impli- 
cations of the discipline as well as conceptual problems within it. The 
book begins with a well-documented summary by C. J. Lumsden and C. 
Gushurst of the current position in human sociobiology. They explicitly 
distinguish between that and what in ten years has become ‘classical socio- 
biology’. The latter, they say, tried to link genes to social phenomena 
without reference to human consciousness. Their contention now is (p. 8) 
that much human culture is sustained ‘by gene-culture transmission rather 
than by pure cultural transmission’. 

The collection contains important contributions to the debate about 
sociobiology. Alexander Rosenberg, for example, argues against the attack 
mounted by Gould and Lewontin on the ‘Panglossian paradigm’ according 
to which every trait is seen to have adaptive significance. He says (p. 177): 
‘There really is no alternative to adaptionalism short of surrendering the
theory of evolution itself.’ Elliott Sober writes a defence of methodological 
behaviourism in some contexts, regarding game theory as a perfect setting 
for it. He suggests that if human beings produce a given behaviour in virtue 
of certain cognitive machinery, but members of another species produce 
the ‘same’ behaviour without such machinery, then we ought to abstract 
away from the cognitive details peculiar to our species if we want to say 
what we and the other species have in common. 

Two issues surface as being of particular philosophical interest. The 
relevance of biology to ethics continues to be discussed, and Lawrence 
Thomas tries to show that ‘our taking people seriously can be regarded as 
something we do naturally’ (p. 126). The epilogue by Michael Ruse, on the 
other hand, emphasises epistemology. He argues for a strong connection 
between science and reproductive advantage. Evolutionary epistemology 
does not just rely on the theory of evolution as an analogy for the way 
knowledge grows. It should, he thinks, maintain a real connection between 
our understanding and evolutionary advantage. Yet this appears to point to a
link between truth and advantage which may not hold. Do true beliefs 
always help their holders? May not false ones also on occasion provide 
reproductive advantage? Ruse himself refers to the survival value of 
religion, and yet clearly does not suppose that this shows its truth. Even a 
belief in the truth of sociobiology may not make us biologically fitter. Indeed 
if it shatters important illusions, say about the objectivity of morality, as
some sociobiologists suggest, the belief could eventually have disastrous 
consequences for its holders. Becoming aware of the workings of evolution 
may itself hinder those workings. 

All in all, this is a stimulating collection, and by no means all the papers 
are uncritical of sociobiological assumptions. They demonstrate that human 
sociobiology deserves to be taken seriously, and cannot be dismissed out of 
hand, however much some may wish to do so. 


ROGER TRIGG 
University of Warwick 


RESCHER, NICHOLAS [1984]: The Limits of Science. Pittsburgh series in 
philosophy and history of science, vol. 10. University of California 
Press. Pp. xii +225. £27.50. 


According to its Introduction, this book attempts to give its readers “a
realistic understanding of the inherent limits of the scientific enterprise so
as to prevent inflated and unrealistic expectations, and thus to avoid the
backlash of reproach, recrimination, and alienation to which the disappointment
of such unreasonable expectations could all too easily lead”
(p. 1).

Most of the material presented in this book has been published before. 
As the author acknowledges, chapters 2, 3, and 9 are taken from or based
upon his Empirical Enquiry (Totowa, N.J., 1982); the final section of chapter
6, and chapter 10 are based upon his Scientific Progress (Oxford, 1978),
although the latter not, as Rescher writes, on chapter 10 of that book, but
on chapter 11. But some other chapters also contain previously published
material, and I have not found this mentioned in the book under review:
chapter 7 is almost identical with Rescher’s paper in the Grünbaum
Festschrift edited by Cohen and Laudan, Physics, Philosophy and
Psychoanalysis (Dordrecht, 1983), while chapters 5 and 8 are expanded
versions of his contribution to the 1975 Kronberg conference, edited by 
Gerard Radnitzky and Gunnar Andersson under the title The Structure 
and Development of Science (Dordrecht, 1979). I have to note that this very 
same paper has had an earlier unacknowledged republication in Rescher’s 
book Cognitive Systematization (Oxford, 1979). 

There are no principled objections against republishing previously pub- 
lished work. Indeed, papers may profit from being put side by side. But it 
has its dangers, and the present author has not been altogether successful 
in avoiding them. The book under review repeats itself too much; moreover, 
it is debatable whether the several topics discussed between its covers form a
natural whole: for instance, it is not clear to me that the intended readership, 
seized with disappointment, distrust, or even hatred of science, is much
helped by a discussion of the question whether science will keep coming 
up with fresh problems. 

There are also some traces of hack writing in the book. On page 49, in a
footnote, the same passage from a book by Richard Feynman is quoted 
which has already been given, in an earlier chapter, on page 43. On page 
52 the same quotation from a paper by the physicist Jean-Paul Vigier is 
given twice: once in the text, and then repeated in the footnote documenting 
the quotation. More examples could be added. One fears that the book has 
not seen an independent editor, for Rescher has published his new book in 
a series of which he is himself an editor. 

These faults are all the more deplorable since Rescher is one of the most 
reasonable philosophers of science now active in the United States. In this 
book, as in his previous publications, he tries to paint a many-faceted 
picture of science, by bringing together considerations from the logic of 
science, the history of science, and the sociology of science and adding 
remarks based on the work of his former colleague, Larry Laudan, concerning
the history of scientific methodology. Rescher is a philosopher of
the sensible middle path who tries to find value in such diverse thinkers as
Karl Popper and Martin Heidegger. It seems to me that it is especially when
one gives a balanced account of things that one needs to attend closely to matters
of formulation and style, since the danger of stating the obvious is
considerable.

The book under review discusses the question whether there exist theor- 
etical limits to what science can accomplish. It dismisses in chapter 1 the 
argument that since any scientific explanation has to presuppose something 
which is not explained itself, something is left out in every explanation and 
therefore lies beyond the reach of science. The argument clearly involves a
fallacious quantifier permutation. From chapter 2 on (“Question dynamics 
and problems of scientific completeness”) the book then takes a very profit- 
able turn of discussing its question as to the limits of science in terms of a 
characterization of science as an enterprise which poses questions, evaluates 
tentative answers, and on the basis of them comes up with new questions 
or dismisses old questions with their old answers as involving incorrect 
presuppositions. Three issues are raised: can scientific progress be under- 
stood in terms of changes in the set of questions science poses, and the 
subset of these that science can answer?; are there any theoretical reasons 
to believe that science will ever run out of questions to work on, having 
answered all its old questions?; are there questions that science should 
answer but that can be shown to be unanswerable by science? Rescher’s 
answers are: No, no and no. 

His first answer is directed against Popper and Laudan. Popper thought 
that later theories are better because they answer all questions answered by 
their predecessors, plus some additional ones. Against this, Rescher argues, 
using his analysis of questions and their presuppositions, that typically new 
theories make old questions senseless because they provide no support for
the presuppositions of the old questions. Larry Laudan sees progress in a 
mere increase in the quantity of answered questions: the replacing theory 
answers more questions, although not necessarily all the questions answered 
previously. For Laudan, Rescher sees problems in the individuation of 
questions, necessary before we can count them, in the fact that we still need 
a theory-independent account of the adequacy of an answer, and of the 
significance of the question. But these are not forthcoming, because we 
need a theory to judge the adequacy of an answer, and we need a change in 
theory to judge the significance of the question whose answer led to the 
new theory. Rescher’s own theory of scientific progress, which he has 
already written on in his Scientific Progress, says that it “must be taken to
rest on its pragmatic improvement: the increasing success of its applications 
in problem solving and control, in cognitive and physical mastery over 
nature” (p. 90). Rescher believes that, thus characterized, scientific progress 
is both context-independent and probably present: “practical problems 
have a tendency to remain structurally invariant. The sending of messages 
is just that, whether horse-carried letters or laser beams are used in transmitting
the information” (p. 90); and “no sophisticated complexities are
needed to say that one stage in the career of science is superior to another 
in launching rockets and curing colds and exploding bombs” (p. 93). Thus 
while Rescher has a clear eye for the historical dimension where it concerns 
theory, he cannot shed his a-historical prejudices where it concerns praxis. 
Not only are specific human needs remolded and generated anew by
technology itself, but the very notion of technology has also undergone
a change. There is no pre-given set of human needs that science satisfies
today better than it did in earlier times. Moreover, the notion of control
could only arise when Nature became conceived of as a machine, as something
which “can be ordered” (das Bestellbare), which “is at our disposal”
(das Verfügbare), to use the terminology of Heidegger [1954]. But even
given a fixed set of human needs scientific progress as conceived of by 
Rescher is not epistemologically transparent: if a technological change is 
an improvement, it is not provably so. Our best means to assess the causal
effects of changes, the comparative randomized trial, gives us information 
only about a finite number of possible effects. A multitude of so-called side- 
effects will at any point remain hidden. Rescher’s sharp analytical mind 
remains ineffective in the treatment of technology, I feel, because he lacks 
a formidable opponent. He has quoted extensively from the older German 
literature. He should read some more recent stuff, such as Kurt Hübner’s Kritik
der wissenschaftlichen Vernunft, which is now available in English (Hübner
[1983]).

Could science ever run out of questions to work on, having answered 
all its old questions? Rescher’s answer is somewhat unclear: theoretically it 
could happen, but this answer is given “at the level of supposition contrary
to fact” (p. 138), due to what he calls Kant’s Principle of question-propa- 
gation. He points out that if it happened we could not have warrant to
believe it, because we would need to know that we have covered all possible 
physical interactions. “The idea of a consolidated erotetic completeness
shipwrecks on the infeasibility of finding a meaningful way to monitor its 
attainment” (p. 139). But his final answer is that this could not happen due 
to the dynamics of scientific questioning and answering. Rescher has many 
laudable things to say, but I can make here only two critical comments. 
Rescher attaches the name of Kant to the following principle of question 
propagation: “The answering of our factual (scientific) questions always
paves the way to further yet unanswered questions.” He justifies this by 
quoting one passage from Kant’s Prolegomena. In the book under review 
the quotation is very short indeed (p. 28). In Scientific Progress the same 
passage is quoted a little more extensively: 


Who can satisfy himself with mere empirical knowledge in all the cosmological 
questions of the duration and of the magnitude of the world, of freedom or of 
natural necessity, since every answer given on principles of experience begets a fresh 
question, which likewise requires its answer and thereby clearly shows the insufficiency
of all physical modes of explanation to satisfy reason? (Prolegomena, sect. 57; italics 
from Rescher) 


My point is not that to invoke Kant on the basis of one side-remark he has 
made, is a kind of name-dropping which this idea of Rescher’s does not 
need. The point is that Rescher has completely misinterpreted this Kantian 
passage. Ironically, Kant’s argument is the very same argument for the 
incompleteness of science which is dismissed by Rescher in his first chapter 
as “too facile—too quick and easy” (p. 17). It is the “something-is-always-
left-out” route to charges of incompleteness. A little earlier in the same
section Kant wrote 


Experience never satisfies reason fully but, in answering questions, refers us further 
and further back and leaves us dissatisfied with regard to their complete solution. 


Kant argues for the insufficiency of scientific explanations, especially of 
causal explanations of individual facts, because they lead to an infinite 
regress of causal questions which will not “satisfy reason”.

My second critical comment concerns a similar misinterpretation by
Rescher. On page 150 he quotes approvingly a formal argument which was 
first published by Fitch [1962] and later put to philosophical use by Routley 
[1981] to the effect that there are truths which are unknowable. 

Rescher writes that, although “this sort of argumentation for the incom- 
pleteness of knowledge is too abstract and “general principally” to carry 
much conviction in itself (. . .) it does provide some suggestive stagesetting 
for the more concrete rationale of the imperfectibility of science that has 
concerned us here”, viz., that science will never be done with its job. But 
this is not what the argument seeks to establish. Indeed, the conclusion of 
the argument should be unwelcome to Rescher. For he thinks that although 
at any point on its journey science will have unanswered questions, none 
of these questions are in principle unanswerable: there are no insolubilia,
all truths in its domain are knowable. I concede that the Fitch-argument, 
if acceptable, does not necessarily establish that there are insolubilia, let 
alone, exhibit a particular one. The unknowable truths may not belong to 
the domain of science. But the proper place to discuss the argument is 
his chapter against insolubilia, not his discussion of the unrealizability of
perfected science.

This was the third issue: are there questions that science should answer 
but that can be shown to be unanswerable by it? Rescher’s answer is, again, 
no, but his reasoning turns the answer into a tautology. We cannot expect 
science to answer questions which lie outside its domain, he argues. But 
the domain of science is characterized in terms of scientific method: 


“. . . whether there are issues that we humans would want to have resolved that
could not be resolved in terms of scientific information about the world even if the 
scientific project were carried through to completion and perfection. Such issues, 
in short, would by their very nature lie outside the domain of science.” 


Rescher is better in discussing specific purported scientific insolubilia.
There is a section on the Reymond-Haeckel controversy, and a discussion 
of the question “Why is there anything rather than nothing?” with a ref- 
erence to the work of John Leslie [1979]. In the final chapter Rescher is again 
very broad-minded: scientific knowledge is just one mode of knowledge, and 
knowledge is just one good among others. But his discussion is again 
somewhat disappointing, because he does not consider the possibility of 
conflict and incompatibilities. “We have no choice but to follow science
where it leads us” (p. 216). (But who will “lead” science?) Knowledge is
good since “knowledge is a key component of the good per se, because of
its smooth fit within the overall economy of norms. (. . .) Whatever other 
projects we may have in view—justice, health, environmental attractiveness, 
the cultivation of human relations, and so on—it is pretty much inevitable 
that their realization will be facilitated by the knowledge of relevant facts” 
(p. 208). Knowledge may be good, but knowledge never comes for free: the 
process of obtaining scientific knowledge requires a discipline of people and 
may imperil justice, health, environmental attractiveness, the cultivation of 
human relations, and so on. 
ZENO G. SWIJTINK 
SUNY at Buffalo 


REFERENCES 


FITCH, F. B. [1962]: ‘A logical analysis of some value concepts’, Journal of Symbolic Logic, 
28, pp. 135-142. 

HEIDEGGER, M. [1954]: ‘Die Frage nach der Technik’, in Reden und Aufsätze. Pfullingen.

HÜBNER, K. [1983]: Critique of Scientific Reason. The University of Chicago Press.

LESLIE, J. [1979]: Value and Existence. Oxford University Press. 

ROUTLEY, R. [1981]: ‘Necessary limits to knowledge: unknowable truths’, in Essays in Scien- 
tific Philosophy (E. Morscher et al., eds.). Comes Verlag. 



SOBER, ELLIOTT [1984]: The Nature of Selection. Evolutionary Theory in
Philosophical Focus. M.I.T. Press. x + 383 pp. (ISBN 0-262-19232-2).


Scholars are unlikely to disagree with Professor Sober’s contention that, in 
both its explanandum and its explanans, evolutionary explanation holds 
considerable philosophical interest. Such concepts as ‘fitness’, ‘chance’, 
‘adaptation’, ‘selection’ cry out for analysis, whilst the arguments in which 
they are conventionally embedded have often been less than rigorous. Add 
the controversial, reductionist, claims of those who would confine the oper- 
ation of natural selection to the genetic level and a topical agenda soon takes 
shape. It is tackled here with unremitting vigour, resulting in a polished and 
authoritative volume which will be valued by novices and experts in the 
philosophy of evolutionary biology. 

The concept of ‘fitness’ is first scrutinised in preparation for an assault 
on that familiar charge that ‘the survival of the fittest’ is tautological. 
Explicating ‘fitness’ as a probabilistic quantity, Sober notes that actual 
survival rates do not define fitness values. But his discussion is more refresh- 
ing than that; firstly because he draws on that philosophical literature which 
has been concerned to show that the idea of apriority is itself problematic 
(making it very difficult to demonstrate that a proposition in a scientific 
theory is truly a priori); and secondly because he provides a justification 
for doubting whether the consequences for evolutionary theory would be 
deleterious, even if certain characterisations of ‘fitness’ were non-empirical. 
An explanation as a whole can still be empirical as long as some components 
of it are. The efforts of Gould and others to establish independent criteria 
for determining fitness differentials (via some kind of design-analysis) are 
also reviewed, resulting in as illuminating an account of the tautology problem
as one is likely to find.

With reference to the ‘chance’ elements in evolutionary theory, Sober 
emphasises that, in a central sense, both gene mutation and natural selection 
are deterministic processes. He shows how the slogan of ‘random mutation’ 
can be misleading, prompting the historian of science to reflect that when 
Darwin himself used the word ‘chance’ it was sometimes to cover an ignor- 
ance of the origin of variation, sometimes to refer to the intersection of two 
otherwise independent causal chains. One wonders, indeed, whether the 
stress placed on ‘chance’ by the Darwinians might not reflect that dialectical 
process through which natural selection emerged in opposition to the design 
motifs of natural theology. Not that the concept of ‘chance’ is now redun- 
dant. It finds at least one application, as Sober shows, in the context of 
random genetic drift. 

The asymmetry between evolutionary explanation and prediction, to 
which Scriven drew attention in 1959, constitutes a further focus of the 
analysis—the author persuasively critical of both aspects of Hempel’s sym- 
metry thesis. The circumstances in which natural selection can be said to
be an ‘improver’ are also carefully analysed, with due dissociation from the 
idea that it is inherently an instrument of improvement. 

Among several distinctions made to do original work is that between 
selection of objects and selection for properties. If the smallest in a mixed 
bag of balls all happen to be green, then shaking the contents in a device, 
fitted with holes of a size which allows only the smallest to drop through, 
will result in all the green balls falling into a lower compartment. But 
the selection was for being small, not for being green. The distinction is 
mobilised in evaluating the causal claims of those who insist that all selection 
is selection for and against single genes. In evolution by natural selection 
there must be selection of single genes, but it need not follow that the 
process always involves selection for single genes. 

This is one move among many in a robust critique of Dawkins’ genic 
selectionism. Welcome attention is paid to the ‘directness objection’ (that 
selection cannot ‘see’ genes and pick among them directly, but must use 
bodies as intermediaries), and to the ‘context-dependence objection’ (that 
no gene has a fixed selective value but may confer high fitness with one 
genetic background, whilst being virtually lethal with another). Dawkins’ 
rejoinders are fairly considered, but Sober concludes that context-depen- 
dence does present serious difficulties for an exclusively genic selectionism. 
Arguments for restriction to the genic level based on a principle of par- 
simony are also deemed inadequate, since the simplicity of a hypothesis 
cannot explain why it is or is not true: ‘a full resolution of the units of 
selection controversy must tell us not only whether but why group selection 
is common or rare’. 

The reference to group selection reflects Sober’s concern to explore other 
levels at which selection has been claimed to operate. Group selection, for 
example, appears to be an alternative principle that makes the existence of 
altruism possible. He is therefore led to consider such questions as whether 
group selection can always be redescribed in terms of individual selection 
and what consequences follow if natural selection can always be represented 
in terms of genic selection coefficients. His thesis is that representability 
may not get to the causal heart of the matter which can ultimately be reached 
only by empirical means, in which case the parsimony argument that
appeals to representability becomes ‘entirely irrelevant’ to the unit of selec- 
tion controversy. 

Disaffection with Dawkins’ arguments does not, however, lead him to an 
uncritical acceptance of selection at phenotypic and group levels. Rather 
than generalise on what the units of selection actually are, his aim is to 
extract the ‘base conception’ of what it would take to be a unit of selection. 
The upshot is a plea for toleration and an openness to what the biological 
data may tell us. In some cases he is prepared to speak of genic selection as 
true (where a genic property is a positive causal factor in survival and 
reproduction). In other circumstances, group and individual selection may 
operate, with each favouring the same characteristic, or with group selection
increasing a character that is neither favoured nor disfavoured at an indi- 
vidual level. Throughout there is a healthy disrespect for the kind of legis- 
lation that defines group selection in a way that requires altruism, and defines 
altruism so that it cannot evolve by any sort of selection. 

The cliché that it is impossible in a few lines to do justice to the fine- 
structure of the arguments is nowhere more applicable than here. It is the 
fine-structure that is most rewarding, informed as it is by telling examples 
drawn from the biological literature, by constructive analogies lifted from 
the physical sciences, and by recent (if sometimes atypical) literature on 
Darwin himself. In short, a substantial and welcome addition to which 
reference will have to be made. 

JOHN HEDLEY BROOKE 
University of Lancaster 


ACKERMAN, ROBERT J. [1985]: Data, Instruments, and Theory: A Dialectical
Approach to Understanding Science. Princeton University Press.
Pp. xii + 216. £39.25 (ISBN 0-262-01083-6).


In his latest book, Robert Ackerman tries to unite his earlier discussion of 
‘data domains’ (from his The Philosophy of Science, 1970) and of evolution 
through selection as a metaphor for the development of science (from 
his The Philosophy of Karl Popper, 1976) by converging on the use of 
instruments. There is also a discussion of the results of sociology of science 
(mainly British), and of the philosophy of sociology (mainly German). The 
aim of the book is to resolve the ‘conflict between Kuhn and more traditional 
philosophical epistemologies concerning subjectivity in science’ (p. ix). The 
book can be welcomed as drawing attention to the experimental as opposed 
to the theoretical, to tinkering instead of thinking. As such it joins recent 
work by, among others, Ian Hacking and Dudley Shapere. But, although
the book is written in Amherst, Massachusetts, it seems to be conceived in 
ignorance, or in disdain, of most analytical philosophy of science written 
in North America, just as that literature has passed over Ackerman’s efforts. 
As the epithet ‘dialectical’ in the title of his book suggests, Ackerman has
found refuge in the camp of ‘generalists’, now gaining the upper hand
in some professional organisations such as the American Philosophical
Association. Indeed, his current book ‘would not have been possible without
. . . the euphoria consequent to the victory of the transworld depraved 
philosophers’ (p. xii). 

The book has four chapters and an appendix. Chapter 1 discusses and 
rejects both empiricism and rationalism as viable attempts to ground scien- 
tific knowledge. Neither position can be closed, it is argued, in terms of its 
own assumptions. Rationalism has a problem of reference: why should
knowledge read from clear and distinct ideas be knowledge about the same 
world as studied in modern scientific experimentation? Empiricism foun- 
ders on Hume’s problem and is unable to show that paradigmatic examples 
of scientific knowledge do indeed qualify as knowledge. Moreover, both 
empiricism and rationalism are epistemologies constructed on personal 
belief structures. They are based on what is here called the Cartesian 
assumption (and what John Ziman has called the myth of Robinson Crusoe 
science) that the individual scientist is the appropriate locus of philosophical 
reflection into the nature and scope of scientific practice. To replace these 
traditional epistemologies, Ackermann sketches a theory that is both social
and dynamic. Knowledge claims cannot be based on an individual’s fol- 
lowing a preferred set of methodological rules, but knowledge is the product 
of a social legitimation process that is always open-ended and always fallible. 
Social legitimation works by way of developing acceptable data domains 
and by their interpretation by competing theories. This process is open- 
ended since the total ‘data text’ (an expression used interchangeably with 
that of data domain) is never fixed. New experiments and new or improved 
instruments may always be developed whose data require an interpretation 
cohering with that of the previous text. The process is fallible since in- 
struments create what is called an invariant relationship between their 
operation and the world (‘at least when we abstract from the expertise 
involved in their correct use’). An instrument is said to break the connec- 
tion between theory and observation, so that data can be gathered indepen- 
dent of current theorising. ‘After change in theory, it will continue to 
show the same reading, even though we may take the reading to be no 
longer important, or to tell us something other than what we had thought 
originally’ (p. 33). 

Chapter 2 contains a critical discussion of Merton and Kuhn, blaming 
Kuhn for a disregard of the experimental side of science and for too much 
stress on ideas, thought, Weltanschauung, and Merton for a set of norms 
that does not differentiate between science and other academic disciplines
such as literary criticism. Studies using citation research are similarly unable to
account for the differences between the physical and the human sciences. 
There is a section on the cognitive reasons for controversy, where different 
parties have staked their intellectual lives on contrary theory. Even under 
a Popperian analysis of science there would be no cognitive advantage to 
controversy (as opposed to an impartial debate of how a variety of bold 
theories fares under the presently accepted data). This is connected with 
Enrico Bellone’s discussion of a scientist’s ‘dictionary’. Controversy is 
interpreted as the process by which the significance of good work is settled. 
It is not clear, however, why Ackermann thinks that, apart from psychological
reasons, Popperian criticism cannot do the same. 

In chapter 3 it is argued that the historical nature of science, the fact that
the relevance of data can be evaluated only in hindsight and that the
significance of the raw data at the growing frontier of science is often
unproved, makes it impossible to draw a distinction between science and
pseudoscience, or between pure and applied science.

Chapter 4 is the longest and central chapter of the book. There is a section 
‘The Social Construction of Scientific Fact’ with a valuable discussion of
Jerome Ravetz’s classic Scientific Knowledge and its Social Problems, whose
two-stage social legitimation process is applied to the Ehrenhaft–Millikan
dispute as documented by Gerald Holton. Ludwik Fleck’s 1935 study of the
Wassermann reaction, Genesis and Development of a Scientific Fact, and the
book Laboratory Life: The Social Construction of Scientific Facts by Bruno
Latour and Steve Woolgar receive similar attention.
The object of this section is to show that although scientific facts are 
constructions and do not occur at the level of an (idealised) individual 
scientist, this ‘does not mean a concession to any easy relativism’ (p. 124), 
since victorious opinions win only because they can, and the losing ones
cannot, be linked to ever improved experiments and measurements. The
circle of thought and interpretation is broken. This process is at any point
fallible, however, and truth cannot be identified with what happens to
emerge in its course. ‘Important facts’ may remain unrecognised, but
Ackermann does not say what makes facts important independent of the
contingencies of the social legitimation process and his notion of realism
remains unclear. 

In ‘Data Domains’ the evolutionary metaphor alluded to earlier is 
developed in more detail. Theories are identified with biological species and 
facts with environmental niches. There is an interplay, called dialectical, 
between facts and theories, but the book does not provide a separate dis- 
cussion of what is meant by ‘dialectical’, and its meaning has to be inferred 
from its use. On page x, for instance, it is said that ‘instruments break the 
connection between theory and observation, allowing the dialectic between 
theory and observation to take place’ which suggests that an interplay 
between A and B cannot be dialectical unless A and B are relatively inde- 
pendent. But many readers would have been helped by a separate 
discussion. The metaphor suggests questions like, ‘How can data domains
be individuated?’ (by which Ackermann seems to mean the ontogenetic
question, like speciation) and, ‘How can data domains be defined?’ (which
is asking for the ontological principle of individuation of data domains, like
‘what are the necessary and sufficient conditions for two individuals to be
of the same species?’). But he does not always separate these two questions,
and there are suggestions that he thinks he should not. For instance, he 
may believe that to answer the ontological question he may have to ‘regard 
either side of such an interesting pair of concepts as fixed’ which he finds 
hard to do (talking about the pair species/niche). The problem Ackermann
wrestles with is related to, but not identical with, Dudley Shapere’s notion 
of the ‘domain of a theory’, and it is an unhappy consequence of philo- 
sophical controversy (with no cognitive advantages to be discerned) that he 
does not take advantage of Shapere’s more analytical discussion. 




In ‘The Microprocessing of Scientific Fact’, another expression taken
from Latour and Woolgar, Ackermann takes up the case where data and
theory are relatively stable over time and a close adaptation can be expected, 
a phase related to what Kuhn has dubbed normal science. It is a situation 
which suggests a correspondence theory of truth in which language is ideally 
equipped to achieve a close fit with reality. I missed a discussion of the
question of how data domains and facts are related. ‘Theory and Experiment’
discusses the situation when consonance is destroyed by new data or new 
theory and factors external to science may play a significant role in sug- 
gesting ways to restore equilibrium. Forman’s study of Weimar Culture 
and Quantum Mechanics, not mentioned by Ackermann, comes to mind.
Finally, there is an appendix in which the notion of data domain is used to
illuminate the debate between Theodor Adorno, Karl Popper, and others,
on the nature of sociology, the so-called Positivismusstreit of 1961, together
with a discussion of mathematical economics versus historical approaches
to economics and holistic Marxism.

I found this study stimulating reading and am impressed by its wide 
scope and the daring of its author to attack interesting and hard questions. 
But after having read it, I feel bewildered and confused as to its arguments 
and its conclusions. Ackermann seems to take a justifiable contempt for
irrelevant formalism that mainly generates its own problems for a ticket to 
shun detailed discussion of particular instruments and the theory they 
embody and of particular data and their analysis. There is no classification
of instruments; all of them are lumped together: the instruments to isolate
a phenomenon (the air pump), to create a phenomenon (the cyclotron), to
picture a phenomenon (the electron microscope), and to count or measure a 
phenomenon (the Geiger counter, the digital thermometer). The important 
notion of artifact appears once or twice in the book, but no detailed epis- 
temological discussion is given. The same is true for the necessity to have 
trained personnel use the instruments. Such discussions are essential to 
make Ackermann’s central theses plausible. 

I take the following two to be among his central theses (although, again, 
they are not well distinguished by Ackermann): (i) scientific instruments
break the connection between theory and observation so that data can be 
gathered independent of current theorising; (ii) instruments function to 
break off the influence of assumption on personal observation; if they did 
not exist, the fact of the influence of theory on perception might mean that 
shared data would be impossible. In the following I will briefly discuss the 
case of low-dose electron microscopy which suggests that, while (ii) may 
be correct, the first thesis is less plausible. 

First, however, I have some small points and a remark on the evolutionary
metaphor. What Ackermann calls the Brake Principle had already been 
advanced, before the Cartesians mentioned by him, by Giovanni Benedetti 
(1530-90), a forerunner of Galileo. Secondly, the argument that Carnap 
needs some substantive presuppositions to obtain a suitable theory of
confirmation is confused. The starting point of the argument, viz., that we
have a coin which is either fair or biased towards heads with a chance for 
heads of 3/4, is incompatible with the proposed model: equal probability 
for structure descriptions over a hundred tosses and exchangeability. That 
is, there is no prior over the two chance propositions that leads to such 
probabilities. 
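
The incompatibility can be checked with a short calculation. What follows is only a sketch under one reading of the remark above: a 'structure description' is taken to be the number of heads k in n = 100 tosses, and the symbol w, the prior weight on the fair-coin hypothesis, is introduced here purely for illustration.

```latex
% Assumptions (not in the original review): n = 100 tosses; w in [0,1] is the
% prior weight on "fair"; 1 - w is the weight on "biased, chance of heads 3/4".
\[
P(k \text{ heads}) \;=\; \binom{100}{k}\left[\, w\left(\tfrac12\right)^{100}
  + (1-w)\left(\tfrac34\right)^{k}\left(\tfrac14\right)^{100-k} \right],
  \qquad k = 0, 1, \dots, 100 .
\]
% Equal probability for the 101 structure descriptions would require
% P(k heads) = 1/101 for every k. Consider k = 0:
\[
P(0 \text{ heads}) \;=\; w\,2^{-100} + (1-w)\,4^{-100} \;\le\; 2^{-100} \;<\; \tfrac{1}{101} .
\]
```

On this reading no choice of w yields the uniform distribution over structure descriptions. Any such mixture of the two chance hypotheses is automatically exchangeable, so the conflict lies with the equal-probability requirement alone.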

With respect to the evolutionary metaphor I feel that it has been sugges- 
tive, but that, when one tries to get details, it becomes almost empty. Of 
course, there has been evolution: science develops. But the mechanism by 
which science has developed is not clear at all. The theory of biological 
evolution would not be explanatory if it did not state a number of causal 
factors with which actual cases of evolution can, after the fact, be explained 
and whose reality and mode of interaction can be exhibited in designed 
experiments and by mathematical modelling that gives plausible results. 
Even if we could find a perfect fit between a theory of biological evolution 
and a theory of the development of science (by a map that identifies key 
concepts in both theories, say, theories with species) which saves the 
phenomena of scientific development, that last theory may still be unsatis- 
factory since it is unlikely that it will contain plausible causal principles. 
The kind of objects we meet in science are so different. Theories, facts, 
data domains are abstract objects without causal power. But even if we are 
not looking for anything more than an as-if interpretation, the metaphor is 
tedious because it is likely to break down at many points and therefore 
needs constant checking. Again: the objects we meet in science are so 
different. Theories never die, at least they need not, and may forever live 
in world 3. Theories may adopt elements from each other, while species do 
not commonly interbreed. And so on. 

Robustness and repeatability are the key terms in discussing the 
reliability of an instrument. Indeed, without some of it one would not speak 
of an instrument. Is robustness a theory-independent notion? For instance,
in low-dose electron microscopy used in the imaging of macromolecules such
as proteins and ribosomes (high radiation would destroy the objects) there
is a dominating quantum (‘shot’) noise in the primary image with a very 
poor signal-to-noise ratio. Furthermore, there is the phase problem: the 
phase information is not recorded in a single exposure, and has to be 
reconstructed from several exposures taken with different settings of the 
defocus parameter of the microscope. Computer processing of electron 
microscope images becomes an indispensable tool for the evaluation of the 
structural information contained in the images. The important
epistemological point is that assumptions about the structure of the specimen,
validated on the basis of other kinds of information (chemical, say), have to
be used in the statistical analysis and combination of the raw images before
anything better than an almost uniform image contrast can be observed (cf.
Slump [1984]). That is, the judgement that a low-dose electron microscope
is a robust instrument for the observation of biological macromolecules is
dependent on theoretical assumptions. The example is a direct refutation
of Ackermann’s claim that after a change in theory, our instruments will 
still behave the same. 

But his second thesis still stands and is not affected by the case of low- 
dose electron microscopy, although the microscope is not a particularly good 
example. As I have documented elsewhere (Swijtink [1986]), there is a 
development in instrumentation, culminating in the direct read-off systems 
with digital recording, that has made observation more objective, in the 
sense that personal judgement in the description of what is observed, and 
thereby the possible influence of one’s private mind-set on one’s perception, 
has been more and more eliminated. But there was a price to be paid. Our 
observations have become less subjective, but more theoretical. 


ZENO G. SWIJTINK
SUNY at Buffalo 


REFERENCES 


SLUMP, C. H. [1984]: On the Statistical Analysis of Images in Low-Dose Electron Microscopy.
Ph.D. dissertation, University of Groningen.

SWIJTINK, Z. G. [1986]: ‘The Objectification of Observation: Measurement and Statistical
Methods in the Nineteenth Century’, in The Probabilistic Revolution, Vol. 1: Ideas in
History (eds. L. J. Daston, M. Heidelberger, L. Krüger). Cambridge.


HOLLINGER, HENRY B. and ZENZEN, MICHAEL J. [1985] The Nature of
Irreversibility. Reidel Publ. Co. xi+340 pp. Hardback £33.25 (ISBN
90-277-2080-0)


This interesting and useful book is not quite what its title suggests, for in
fact it says rather little about the nature of irreversibility. Its central concern 
is a particular theoretical technique for dealing with irreversible processes 
as they occur under the special circumstances of dilute gases. Although the 
authors clearly felt that an exposition of this method provided them with 
clues about the nature and origin of irreversibility, those clues have not led 
them to a comprehensive study such as is proposed by the title. 
Nevertheless this book will prove very useful in the teaching of statistical 
mechanics, and especially of that part of the subject which deals with the 
transport properties of dilute gases. The authors have a pleasantly informal 
style of writing and have provided many illustrations for the purpose of 
making difficult ideas intelligible. They have given further assistance to the 
student, in the form of comfort in the face of difficulty, by paying attention 
to the history of Boltzmann’s own intellectual problems. Even so I felt that 
a tightening up of certain arguments was needed, particularly in Chapter 7 
where a peculiar distinction is made between ‘old’ and ‘young’ equilibria. 




But technical issues are not my present concern. The jacket claims that 
the volume will be of particular interest to philosophers of science (as well 
as to physicists and chemists), and therefore my task as a reviewer is to 
consider the book in relation to the Journal’s readers. 

A gap in the contents is the absence of any significant discussion of the 
interrelatedness of irreversibility and ‘time’s arrow’. A process cannot be 
deemed irreversible without having some clear distinction between the 
two directions along the t-coordinate. Although we normally rely on the 
distinction provided by consciousness, the question has to be asked whether 
this apparently subjective element can be avoided. One possibility is to 
make the Second Law provide its own criterion of ‘later than’ by choosing 
some particular process, such as the decay of a large sample of radium, to 
act as a signpost for all other processes. (“The description of the temporal
sequence of events”, wrote Wittgenstein [1961], “is only possible if we
support ourselves on another process.”) Alternatively one might choose the
standard reference process as the expansion of the universe itself. Yet 
neither of these methods is impeccable due to the possibility of fluctuations. 
Also in the case of the universe its expansion may ‘change’, in some sense, 
into a contraction (in the Friedmann k = +1 model and if there is sufficient 
hidden mass). But if ‘later than’ were defined in terms of ‘larger than’ the 
universe could not change into a smaller state at later time! 

The chapter which comes closest to philosophy of science is Chapter 5
where the authors seek to establish criteria for reversibility. On pages 60–3
they put forward three candidates: (1) A process is reversible if it can be
reversed through application of external influences; (2) A process is
reversible if it can be reversed “by reversing the motion variables”; (3) A process
is reversible “if the process and its reverse both occur naturally”.

The context makes it clear that (2) refers to the time-reversal invariance 
of the theory which is supposed to govern the process in question; (1) and 
(3), on the other hand, refer to what can be observed about the process itself, 
i.e., phenomenologically. My understanding of the authors’ intentions is
that they plump (and rightly, I think) for an amplification of (3). Thus a 
process is said to be reversible if the system in question, together with all 
parts of its environment, can be restored to their original states. (To this one 
really needs to add a clause about the restoration being made reproducibly, 
or at some specified time. Otherwise spontaneous restoration by fluctuation 
is not excluded. See Denbigh [1981], pp. 108–9.)

By Chapter 8 the authors have arrived at the consideration of fluid 
dynamics. The neatly-done discussion of the conservation equation, the 
stress tensor, etc., leads on to entropy production, and to an outline of the 
Onsager reciprocal relations. The stage is then all set for the treatment in 
most of the rest of the book (pp. 159–288) of the statistical mechanics of
the irreversible transport processes in dilute gases. 

The equilibrium theory is first established using the method based on the 
assumption of equal a priori probabilities, the most probable distribution,
Stirling’s approximation and undetermined multipliers. The authors then 
turn to the transport processes—diffusion, heat conduction, viscosity— 
and use the resources of BBGKY theory at the two-particle and higher 
approximations for the purpose of allowing for interactions between the 
molecules of the gas. A necessary assumption is that the system in question 
has been recently withdrawn from a state of equilibrium. 

This is satisfactory for the purpose of creating a theory of the transport 
processes, but the question has to be asked whether it really helps in the 
understanding of irreversibility in general. The notion that irreversibility 
is simply a departure from equilibrium pervades the whole book, but it 
detracts from a sufficiently general, although qualitative, approach. Per- 
turbation about an equilibrium state surely has little bearing on many of 
the far-from-equilibrium processes, such as nuclear reactions in stars and 
the ‘clumping’ due to gravitational forces, which are prominent in the 
universe. 

In my view a comprehensive treatment of irreversibility will need to 
cover quite a wide variety of topics, over and above those discussed above. 
For example: Necessary and sufficient conditions for reversible behaviour. 
Nomological v. de facto irreversibility; ‘initial’ conditions and the inheri- 
tance from the big bang. The entropic peculiarity of systems involving 
long range forces; gravitational ‘clumping’. Possibly non-entropic forms of 
irreversibility such as the expansion of wavefronts (retarded potentials). 
Recurrence theorems. Systems which develop ‘chaos’. Biological and 
psychological irreversibility. 

Although I have been a bit critical I greatly enjoyed this book. It contains 
many insights and I particularly appreciated a hard-hitting remark on 
page 94 to the effect that coarse graining can explain the irreversibility of 
statistical descriptions but not the irreversibility of real phenomena. More on 
these lines would have greatly fortified the authors’ choice of title.


KENNETH DENBIGH 
King’s College (KQC), University of London 


REFERENCES 


DENBIGH, K. G. [1981]: Three Concepts of Time. Springer-Verlag. 
WITTGENSTEIN, L. [1961]: Tractatus, 6.3611. Routledge.


JAKI, STANLEY L. [1984]: Uneasy Genius: The Life and Work of Pierre Duhem.
International Archives of the History of Ideas. Martinus Nijhoff
Publishers, The Hague, Boston, Lancaster. Pp. xii+472, list of
Duhem’s publications, indexes. £44.50. (ISBN 90-247-2897-5)


An extended treatment of Duhem’s life and work was long overdue. Not
only does he continue to command attention, but Duhem was a major figure
in the science of his time and in the debates surrounding it: no account of
these matters that does not give his rôle in them serious critical attention can 
be considered adequate. Jaki’s biography builds on recent work by others to 
give us a massive quarry of material for the work. Whether, on the other 
hand, it is the definitive biography we need, I doubt. 

On Duhem’s childhood, schooling and early formation, Jaki’s biography 
is a clear gain. We now have the basis for a rounded picture of the
ecclesiastical and family background, and of his time at the Collège
Stanislas, material difficult to find elsewhere, and the same goes for his time
at the École Normale Supérieure. For the rest of Duhem’s career we have,
on the other hand, much new interesting material for the researcher to 
follow up, but, it seems to me, the resulting picture is not a substantial 
advance on that given by Harry W. Paul. The book concludes with three 
general discussion chapters on the physics, the philosophy and the history. 
The first offers the interesting suggestion that after 1900 Duhem 
condemned his work to irrelevance by narrowing the scope of his theorizing. 
The second, correctly, sees Duhem’s emphasis on common sense as the key 
to his position, but to my mind misses the subtleties of the working out of 
that idea. The third seems to add little to my own discussions [Annals of 
Science, 31, 1976, pp. 119-29, and my thesis (University of London, LSE,
1981)] beyond telling us that Jaki does not accept Duhem’s criticisms of 
Aquinas. 

So much for the gains. The losses soon obtrude. I have no space to do 
more than mention most of these. Besides the combative style, there is the 
amount of space wasted on pointless historical ‘colour’ whose basis is often 
merely speculative. Jaki has chosen to translate his extensive quotations 
from his sources, but readers with a knowledge of French will often be 
irritated by the gaucheries of some of them, like (page 36) ‘passing’ exams for
‘passer’ (= ‘sit’ exams). Arguments are often only suggested, not concluded. 
The major problem, though, is that Jaki has not addressed, let alone solved, 
the complex historical questions surrounding the actor Duhem, preferring 
to defend his honour and heap abuse on his opponents instead. Jaki does not 
stop to analyse the behaviour of Duhem and his various friends and enemies 
to see what exactly was at issue between them. It is not quite true that, for 
Jaki, it is Duhem (and the Catholic Church) right or wrong, but it 
sometimes feels like it. But analysis is what we need if we are to get any 
nearer the historical Duhem. 

The worker on Duhem is faced at the outset with two rival biographies, 
that of his student friend Edouard Jordan, written shortly after his death, 
and that of his daughter Hélène, with contributions by many hands, written
20 years later. Jordan’s has a beauty and a coherence that gives us a sense of 
the man, but in one important area, the political, it is certainly not quite 
right. Jordan himself seems to have been sympathetic to liberal Catholicism 
and reconciliation with the Republic and to have made Duhem out to be at 
least not hostile to these views. There is, however, plenty of evidence that
Duhem, royalist by upbringing, held the Republic in very low esteem, 
seeing it (possibly rightly) as the source of many of the ills that had befallen 
French Catholicism and even France as well. Hélène, on the other hand,
presented a picture of a royalist anti-semite who doubted the motives of the
republicans who campaigned on behalf of Alfred Dreyfus in the 90’s, and
went on to become a supporter of Charles Maurras’ Action Française, the
movement of Comtian inspiration that saw in support for the Catholic
Church and a restored monarchy the road to the restoration of the glories of
a France free of Jews. It is possible, though, that Hélène too was painting
Duhem in her own political image. If Jaki is right (page 90, fn. 65, source not 
given), she herself was a supporter of the movement. She concedes that 
Duhem had Jewish friends: one of them was Dreyfus’ brother-in-law 
Jacques Hadamard. The phrase ‘some of my best friends are Jews’ is 
proverbial, but this is not the only evidence that Hélène is no nearer the
truth than Jordan. Another of Duhem’s student friends, the Catholic 
philosopher Maurice Blondel, used his ownership of the Annales de 
Philosophie Chrétienne to run a series of anonymous articles denouncing as 
un-Christian Action Française and all it stood for. Duhem was associated
with this journal, which published his ‘Physics of a Believer’ and To Save the 
Phenomena. Duhem’s surviving correspondence with Blondel also allows us 
a reasonable shot at interpreting his attitude to Blondel’s campaign. Despite 
its playful irony, it is clear that Duhem, far from disapproving, is cheering 
Blondel on. The real Duhem was neither that of Jordan, nor that of Hélène, 
but someone else, and those who seek to know him must first grapple with 
the discrepancies between these two accounts systematically. Jaki makes 
critical comments, but in the absence of this essential first step, he has ended 
up merely embellishing his predecessors with more facts and occasional 
corrections. He remains their prisoner. 

One other matter may be mentioned. A footnote on page 327 criticizing 
‘R. H. D. Martin’ refers to an unpublished early paper of mine on the theme 
of Duhem’s relation to the religious currents of his time. Readers interested 
in my considered views may refer to my ‘Darwin and Duhem’ (History of 
Science, 20, 1982, pp. 64–74).




R. N. D. MARTIN 


DENNETT, DANIEL C. [1984]: Elbow Room: The Varieties of Free Will Worth
Wanting. Oxford. x +200 pp. 


Most books on Free Will tend to be fairly sober exercises in conceptual map 
making, taking the reader systematically over the well-trodden terrain, 
itemizing familiar landmarks and erecting various ‘no through road’ signs 
before tracing the author’s preferred exit pathway. By comparison, these 
Locke Lectures by Daniel Dennett have something of the precipitate 
momentum and acceleration of a rollercoaster. Given a minimum of
tolerance for occasional informalities of style, reading Dennett is an 
intellectual treat from which one emerges refreshed, if a little tousled, with 
the philosophical horizon returning only reluctantly to its accustomed 
orientation.

What kind of freedom do we want? Certainly not that which ‘random 
swerves’ would give, says Dennett, nor the ‘freedom’ to defy physical laws. 
“What we want when we want free will is the power to decide our courses of 
action, and to decide them wisely, in the light of our expectations and 
desires. We want to be in control of ourselves, and not under the control of 
others. We want to be agents, capable of initiating, and taking responsibility 
for, projects and deeds. All this is ours, I have tried to show, as a natural 
product of our biological endowment, extended and enhanced by our 
initiation into society” (p. 169). “We are looking . . . for some elbow room 
for us sinners in between the saints and the monsters” (p. 157). 

Dennett’s main strategy is to uncover the misleading metaphors which, as 
he sees it, have distorted our imaginations when assessing supposed threats 
to freedom. “Anyone who dreads the prospect of not having free will must
have some inkling about what that terrible condition would be like. And in fact 
there are a host of analogies to be found in the literature: not having free will 
would be somewhat like being in prison, or being hypnotized, or being 
paralyzed, or being a puppet, or... (the list continues) ... When we fear that 
we don’t have free will, it is always because we fear that something 
importantly like one of these dreadful things is our fate” (pp. 5 and 6). He 
lays much of the blame on the misuse of ‘intuition pumps’—thought 
experiments that become widely accepted as the basis of our intuitive ‘feel’ 
for a problem and can set unrecognized limits to what we imagine to be 
possible. Unhelpful ‘intuition pumps’, he argues, have been responsible for 
groundless fears, based on oversimplified visions of what science has to tell 
us about ourselves and the rest of the universe. With the aid of more 
appropriate thought-models, drawn mostly from information technology 
and automata theory, he claims that “if we try hard, we can imagine a being
that listens to the voice of reason and yet is not exempted from the causal 
milieu ... whose every decision is caused by the interaction of features of its 
current state and features of its environment over which it has no control— 
and yet which is itself in control, and not being controlled by that 
omnipresent and omnicausal environment. Yes, we can imagine a process of 
self-creation that starts with a non-responsible agent and builds, gradually, 
to an agent responsible for its own character. Yes, we can imagine a rational 
and deterministic being who is not deluded when it views its future as open 
and ‘up to’ it. Yes, we can imagine a responsible, free agent of whom it is
true that whenever it has acted in the past, it could not have acted otherwise” 
(p. 170). 

There is much in this argument with which I wholeheartedly sympathize. 
Under close examination there is no rational force behind the idea that 
determinism (as distinct from fatalism) “erodes control”, or implies that the
past controls us (p. 72). There can be “real opportunities” even in a
deterministic world; though an ambition to “change the course of history” is 
on Dennett’s view equally incoherent whether or not the world is 
deterministic. “It is often said that no-one can change the past. This is true 
enough, but it is seldom added that no-one can change the future either... . 
The future consists, timelessly, of the sequence of events that will happen, 
whether determined to happen or not” (p. 124). Having lucidly examined 
and despatched a whole succession of such fear-inspiring “bugbears”, as he
terms them, Dennett is happy to reach the “optimistic” conclusion that
“free will is not an illusion, not even an irrepressible and life-enhancing 
illusion” (p. 169). 

While endorsing this conclusion, however, I have some misgivings about 
Dennett’s argument at one or two crucial points. The main one comes in 
Chapter 6 on “Could have done otherwise”. After distinguishing between 
the concepts of logical, physical and epistemic possibility, he deplores the 
lack of interest shown by philosophers in the last, which he proposes as “the
key to the resolution of the riddle about ‘can’. The useful notion of ‘can’, the 
notion that is relied upon not only in personal planning and deliberation but 
also in science, is a concept of possibility [that is] fundamentally epistemic” 
(p. 148; see also p. 113). Quite rightly, he disputes Jacques Monod’s claim 
that purely random coincidences must happen if evolution is to take place, 
arguing that a deterministic pseudo-random process could serve as well. 
“What evolution requires is an unpatterned generator of raw material, not an 
uncaused generator of raw material” (p. 150). He goes on, however, to
suggest that in order that a human being “can do otherwise” it is likewise
sufficient that the outcome of his choices be “practically unpredictable”.
What matters, he claims, is the kind of “practical independence” of initial
conditions which obtains even in the deterministic Newtonian physics of 
chaotic systems, such as a pinball bouncing through a succession of 
obstacles. “There is no higher perspective (unless we count the perspective of 
an infinite being) from which the ‘accidental’ collisions of locally predictable 
trajectories are themselves predictable and hence ‘no accident’ after all” 
(p. 152, italics mine). But this will surely not satisfy either believers in an 
infinite being (dismissed without argument in the italicised parenthesis) or 
those who reflect that even in the case of Newtonian chaos, two identical 
systems released from identical initial conditions will always be found 
tracing identical (though technically “chaotic”) trajectories. Whether
accessible to us or not, they will argue, there presumably exists a determinate 
solution to the state equation governing the system, such that if only we 
knew it, we would be correct to believe it and in error to disbelieve it. The 
future of any classically determinate system, whether we know it or not, has 
one and only one specification, determined by its present condition, with an 
unconditional claim to our assent. 

The point missed in Dennett’s analysis is that, oddly enough, the human 
brain presents a unique case of a physical system to which the foregoing
generalisation does not and cannot apply, even if the brain were fully
deterministic in the sense assumed. For a specification of the future to have 
(unknown to us) an unconditional claim to our assent, it must be shown that 
our assent would have no consequences that would invalidate the 
specification. In the case of our own brain, I have argued elsewhere¹ that
the latter requirement cannot be met if we assume (as both Dennett and 
I would) that our assent would entail a significant correlated change in 
some cerebral subsystem. It follows that even on physically deterministic 
assumptions, no complete specification of the future of our cognitive 
mechanism exists (unknown to us) which can claim to be unconditionally 
beliefworthy-by-us.² This applies even from the perspective of an infinite
and omniscient being. Dennett refers briefly to this argument in a footnote 
on p. 112, but draws only the epistemological conclusion that ‘‘my 
uncertainty about my own future is ineliminable’’. But in fact the crucial 
point here is not merely epistemic but ontological. The argument shows not 
merely that you cannot find out the detailed future of your brain, but that no 
specification of it exists, even unknown to you, which can claim that (if only
you knew it) you would be correct to believe it and in error to disbelieve it. In 
that sense, even if you believe that your brain obeys deterministic physical 
laws, you are correct to believe that its future is in detail
indeterminate-for-you, and indeed that it is up to you to determine at least some of the
indeterminate details by the process we call thinking, valuing and choosing. 

Note that the foregoing argument does not deny that your brain state 
might be predictable-for-non-participants. Here Dennett’s point about the
‘practical independence of initial conditions’ shown by chaotic systems has 
full force, at least in relation to human observer-predictors. I agree with him 
that the science-fiction image of a world of predictable human actions is a 
practical impossibility. I think it important, however, not to suggest that 
belief in our freedom to determine our future must shelter behind this 
purely epistemic barrier—especially if it dawns on us that there may, after 
all, be an infinite being for whom that particular barrier is no obstacle. This
has a special bearing on the question raised on p. 165, whether the 
“absolutist ideal” of “total, before-the-eyes-of-God-guilt” must be
abandoned. What Dennett fails to recognize is that from the perspective of 
God himself (if there be God) there can still be objective grounds on which 
to determine whether a criminal’s act was inevitable-for-him (and so not 
culpable) or else up to him to determine in the foregoing sense (and so 
culpable). Once it is clear that predictability-for-non-participants does not 
necessarily imply inevitability-for-the-agent, scientific determinism offers 
no threat even to the ‘absolutist ideal’ of moral responsibility. 

1 MacKay, D. M. On the Logical Indeterminacy of a Free Choice, Mind, 69, pp. 31-40, 1960.
2 MacKay, D. M. Scientific Beliefs about Oneself. In: The Proper Study (G. N. A. Vesey, ed.),
Royal Institute of Philosophy Lectures, Vol. 4, Macmillan, London, pp. 48-63, 1971.

If the same point had been observed in Dennett’s discussion of “could
have done otherwise” on pp. 132 ff., it could have led to different
conclusions. Dennett (following and extending arguments by Harry
Frankfurt) makes no distinction between the claim that ‘Jones could have
been predicted-by-manipulators to do what he did’ and the claim that ‘Jones
couldn’t have done otherwise’. This confounds two possibilities:


(a) The manipulators may have interfered with Jones’s brain machinery 
(say by drugging him or injecting electric currents) in such a way that 
there now exists a prediction of his action which (if only he knew it) 
Jones himself would be correct to accept as inevitable-for-him, and 
would be in error to reject. Jones could correctly say: ‘I couldn’t have 
done otherwise’. 

(b) The manipulators may have merely engineered Jones’s environmental
circumstances so as to be able to predict (secretly) what Jones will do;
but their prediction is one that has no unconditional claim to Jones’s
assent, because in this case the cerebral consequences of its being
believed by Jones would invalidate the basis of the prediction.³ Jones
cannot correctly say: ‘I couldn’t have done otherwise’, because in case
(b) the outcome is logically indeterminate for Jones until he makes up
his mind; and its predictability for others offers no ground for
denying that he could have done otherwise. The logic of personal
agency is inescapably relativistic.


I have left unmentioned a host of topics on which Dennett has ingenious 
and sometimes original things to say. “Where do reasons come from?”;
“Reflection, Language and Consciousness”; “Community, Communication
and Transcendence”; “The Uses of Disorder”; “The Problem of the
Disappearing Self”; “Diminished Responsibility and the Specter of
Creeping Exculpation” are among his more appetizing subtitles. The
Rylean flavour of some of them is no accident: Dennett studied under Ryle
and pays an admiring tribute to him before getting down to work. One need 
not be a Rylean, however, to derive exceptional pleasure, as well as profit, 
from this masterly performance. 

DONALD M. MacKAY 
University of Keele

3 MacKay, D. M. (discussion of A. G. N. Flew). In: Brain & Mind (J. R. Smythies, ed.),
Routledge & Kegan Paul, London, pp. 129-131, 1965.


RIDLEY, MARK [1985]: The Problems of Evolution. Oxford University
Press. Hard cover (ISBN 0 19 219194 2) £12.50
Paper cover (ISBN 0 19 289175 8) £3.95


Despite its title, this book is not a critique of modern evolution theory. On
the contrary, Mark Ridley is well satisfied with the present state of the
subject. A valid theory of evolution, he tells us, must explain why evolution
takes place, it must fit the facts of heredity, and it must explain why
organisms are well adapted for life. Since he believes that the theory of 
natural selection—and no other theory—can do this, for him the problem of 
evolution is “effectively solved”. What remains, and these are the
“problems” of the title, are occasional slight shifts of emphasis or interesting 
but relatively minor hypotheses about certain outstanding puzzles. 

“Not everyone, however,” Ridley acknowledges, “would agree.” Quite
so, and for reasons which are a good deal stronger than the objections which 
are set up and summarily demolished in the book. 

In the first place, the task of a theory of evolution is surely to explain how 
the organisms we see about us have come to be, and why they are the way 
they are. This is a much broader question than Ridley suggests, and so the 
problem that he claims is solved is not what many readers will think of as the 
problem of evolution. 

Nor will everyone be satisfied with the sorts of explanation that are 
offered. Consider, for example, the discussion of a problem common to all 
theories which concentrate on adaptation: why are organisms not perfectly 
adapted? Why did God choose to create less than perfect creatures, or, if you 
prefer, how do we account for the non-existence of traits which, if they did 
exist, would confer obvious selective advantages? Ridley’s answer is that the 
absence of such organisms must be due to some kind of genetic constraint. 
Now there is obviously some form of constraint involved, almost by 
definition, but why must it be essentially genetic? There are, for example, no 
birds that can fly faster than light. It is certainly true that there are no genes 
for superluminal flight, but any physicist can think of a more fundamental 
explanation than that. 

Many readers will also be surprised at the frequency with which the 
confident word “facts” appears in this book. There is a lot in nature that 
Ridley is sure about, and there is apparently no question in his mind that 
facts and theory may be interdependent. Those who are not biologists, 
however, should also be warned that some of the so-called facts are by no 
means so certain as the word would imply, however we understand it. 

For example, we are told that the unit of selection must be something that 
is inherited in identical form from generation to generation and that it must 
therefore be the gene because “Except for mutations, which are very rare, 
genes are copied exactly each generation.” They are not. Intragenic 
recombination is a well known phenomenon and is described in many 
modern textbooks on genetics. Even Richard Dawkins acknowledges in his 
The Selfish Gene that the gene is not quite the permanent entity that bean- 
bag genetics, neo-Darwinist evolution theory, and Mark Ridley assume it to 
be. 

Ridley also attaches great importance to the doctrine of the independence 
of the germ line, or “Weismannism”. “In fact,” he writes, “acquired
characters are not inherited.” Again, not true. To cite just one example, 
treating normal flax plants with a certain fertilizer brings about in a single
generation heritable changes in morphology, isoenzyme pattern and DNA
content. And even if one holds to a restricted view of heredity which 
excludes maternal effects and cytoplasmic inheritance, the past five or so 
years of recombinant DNA research have made the modern counterpart of 
Weismann’s doctrine, the so-called Central Dogma that information flows 
only in the direction DNA → RNA → protein, no longer tenable.

Even on issues where some neo-Darwinists are beginning to give ground, 
Ridley stands firm. For example, it is gradually becoming accepted that the 
nature of the developmental process does have an influence on the forms that 
appear in evolution. Some evolutionists (myself included) see it as very 
important. Neo-Darwinists naturally disagree, and at most write only of
“developmental constraints” on the action of natural selection, which they 
still consider to be the sole creative force in evolution. Ridley is unwilling to 
concede even this much; for him the very existence of developmental 
constraints is mere speculation. 

Problems of Evolution presents a simplistic and outdated view of the 
subject, in which much of our modern understanding of molecular biology, 
of the process of development, even of physics and chemistry, is totally 
ignored. And this is precisely where its value lies. It can be very difficult for 
an outsider to find out exactly what neo-Darwinism is. Surprisingly few 
books on the subject bother to define it. Many of the authors who write in its 
defence make claims that it is much broader in scope than the mere natural 
selection of small random variations, and it is only after one has read a large 
number of neo-Darwinist textbooks and articles that one realizes this is not
the case. Ridley, in contrast, provides a concise and accurate account of the 
ideas on which the neo-Darwinist paradigm is actually based. Even on the 
issue of developmental constraints he is more consistent, since while other 
neo-Darwinists may pay lip service to the idea, they do not actually take it 
into account in their work. As Bernard Shaw put it, what a man believes may 
be ascertained not from his creed but from the assumptions on which he 
habitually acts. Philosophers of science who want to know what neo- 
Darwinists really believe will find this clearly written little book very 
helpful. 


P. T. SAUNDERS 
King’s College, London 


CURRIE, GREGORY and MUSGRAVE, ALAN (eds.) [1985]: Popper and the 
Human Sciences. Martinus Nijhoff. vii+211 pp. £29.25.
(ISBN 90-247-2998-X).


This book is a landmark in offering critical yet quite appreciative studies of 
the philosophy of Karl Popper, by twelve authors most of whom are not his 
disciples. Yet it is hard to find who will benefit from their careful study. 
Popper connoisseurs will find them full of misrepresentations and of
criticisms which are mild or timid by comparison to those offered by some of
his leading disciples; others, presumably, will hardly show much interest.
The myth characterizing the situation at hand still is the (true) story of 
Ludwig Wittgenstein’s invitation to Popper to come to the Cambridge
sanctuary to recant and repent, of Popper’s acceptance of the invitation so
as to show defiance, and of the host storming out of the room in the middle of 
his guest’s performance. I suppose the two schools of thought represented in 
the story upset each other because the very situation is a loud criticism of 
both. Popper and his followers are busy, creative in an activity declared 
impossible by mainstream-Anglo-American philosophy, so-called. And 
this mainstream exhibits a form of rationality—Popper never allowed 
himself to dismiss them as he dismisses mainstream-Continental philo- 
sophy, so-called—which rationality has no room in Popper’s view since 
they flatly disregard his criticism and since he finds no valuable fruit of their 
labours. 

Yet some change has occurred, and this book is symptomatic of it. The 
process is one declared inevitable by Samuel Butler, and whose strongest 
example these days is the Marxist respectful study of Trotzky as
mainstream Marxist, just as much as the right-deviationists Kautsky, 
Bernstein and Adler. Differences iron out with the gain of new perspectives 
and the passage of time. Yet, scores have to be settled even after the civilized 
burying of the hatchets, or else confusion reigns. Here, to my regret, no 
scores are settled, and perspectives hardly emerge, though civilized the book 
certainly is. 

Indeed, the authors often find it difficult to coordinate, and the paucity of 
Popperian studies of other, mainstream philosophers is no help. At times a 
complaint to this effect is explicitly stated (pp. 107, 116, 148, 167). The 
Popperian disregard of so much that is going on is due to lack of interest. 
The opposite course is often but not always the same. This has led Popper
and others to a somewhat unphilosophic grudge—as if those who might or 
even would comment on Popperian thought owe some comments to the 
public. They do not: they are rational, but the rationality they exhibit is not 
the one Popper describes when he discusses rationality as the institutions of 
learning fostering the critical approach. Michael Polanyi and Thomas S. 
Kuhn exaggerate the other way when they discuss rationality as the 
institutions imposing uniformity. The truth, alas! is more complicated than 
either. 

Ontology is the major topic of discussion here, for no reasons that anyone 
offers—the editors could, perhaps should, have said something about their 
selection. I do not know how one should discuss critically any modern 
ontology. Classical ontology rested on the polarization of the world to reality 
and appearances or nature and convention. This enabled one to examine 
competing ontologies for their relative merit. But it is now passé. Common 
sense ontology is unproblematic and does not permit too much analysis, nor 
too much criticism. Moreover, common sense admits that concrete physical
things, living things, human individuals, societies and nations, and even
concepts, exist in different senses. Ontology goes further, but what makes a
thing which by common sense really exists, an entity proper, what not? And 
why does it matter? One answer, cursory, I am afraid, can be found in 
Popper’s works: If A and B are entities proper, then we may hope no 
satisfactory scientific hypothesis will ever be found reducing the laws 
governing A to laws governing B alone. This claim unbelievably confuses 
satisfactory explanation with true explanation. 

The second and third themes of this volume are Popper’s critique of the 
doctrine of historical inevitability and his liberalism. 

Let me, then, sum up the volume’s content, except for the (unsatisfactory) 
editorial page and the (not representative) bibliography, in the order chosen 
(why?) by the editors. 


1. L. Jonathan Cohen. Popper’s postulate of the existence of the third 
world—of objective spiritual entities—is an essential part of his 
theory of growth of knowledge, not a metaphor (no explanation for 
this is offered). Yet it is a problematic world: it is splittable to parts
not in communication with each other and is clustered with 
contradictions which entail (objectively) all statements, including 
unknown corollaries to given theories. Popper’s Darwinian analogy is 
anyway extremely objectionable. This is correct, but only the last 
point scores. 

2. Frank Cioffi finds two criteria for pseudo-science: one is 
irrefutability, the other this plus pretence. He rejects the one and 
endorses the other—on behalf of Popper. But Popper is ambivalent 
here. The most outrageous pseudo-science is a refutable refuted 
theory presented as refutable and confirmed. By this standard, a 
variety of the doctrines of meaning-as-use, presented as scientific are 
pseudo-scientific, presented as logical are pseudo-logical and 
presented as rational are pseudo-rational. Yet Popper proposes that 
the empirical is the refutable, including the refuted: we learn from 
experience by admitting refutations. 

3. Roland Puccetti. Popper cleverly identifies epiphenomenalism, 
parallelism, and the identity theory, and settles for nothing less than 
interactionist dualism. What makes the mind a true entity, contrary to
epiphenomenalism? Parallelists may declare the mind a true entity 
proper. To deny this is to call everyone a heathen and an infidel who 
does not fully endorse one’s kind of religion. Interactionist dualism is 
objectionable unless souls are mortal. Sir Karl Popper is a mortalist 
yet he shares his book with Sir John Eccles, the immortalist. This 
shows clearly that there are different interactionist theories. Eccles 
thinks that during sleep the brain, not the mind, is shut off. This 
theory is apologetic, and overlooks those states of sleep, where mental 
activity (mental, not just brain) still goes on. Popper and Eccles
speak of the experience of brain-mind interaction, but Eccles, at least,
overlooks the experience of mind-atrophy which need not accompany 
brain-atrophy. Puccetti presents an effort to render interactionism 
falsifiable by a thought-experiment in which links between two sense 
organs and their two respective centres are switched around. The 
thought-experiment only shows him naive about brain centres. 

4. David Papineau ascribes to Popper the view that though souls do
exist, societies do not. He defends this view. Any similarity between
Popper à la Papineau and Popper himself is merely unavoidable: all
avoidable ones are assiduously avoided.

5. A. F. Chalmers claims that Popper’s version of methodological
individualism is inconsistent. He too gets Popper wrong, though at 
least he realises the matter is problematic. Why is it so hard? The word 
‘methodological’ is opposed, Popper says, to ‘ontological’: ask not, do 
societies exist? and assume as a device of creating refutable theories 
that the societies and social institutions which individual actors find 
significant for their actions do affect their rational conduct. I
once ascribed to Popper the view that institutions exist but possess no
ends or purposes or goals. I do not know if I was right. Chalmers 
thinks that Mill’s denial of the existence of institutions differs from 
Popper’s, in that Mill refers to human nature, but Popper does not. 
Yet Popper has shown that the denial forces one to fall back on human 
nature (the methodological myth of the social contract, he calls this 
point). 

6. Alan Ryan. Popper’s philosophy of science and his liberal philosophy
are the same, more or less. Ryan fears Popper would be hostile to this
thesis. Popper states it in his conclusion to his The Open Society: social
institutions are conjectures that can be refuted and reformed, just like
scientific theories. Ryan argues that liberalism can be constitutional
(à la Kant) or fallibilist (à la Mill). Popper is a bit of both, but tending
towards Kant. This is intriguing. It also connects: science rests on 
institutions fostering criticism, yet institutions are fallible: we boost 
them, perhaps, by constitutions. Constitutions, unlike laws, can be 
fallibilist! For my part, I think nothing can replace the principle of 
toleration as a supreme principle. 

7. Jeremy Waldron. Popper’s dualism of facts and decisions. It turns out
that Popper believes in the entity called morality: he is a moral realist 
or objectivist. And his view of ethics is not very idiosyncratic. True. 
What is idiosyncratic, is his view that there are two moralities— 
collectivist (wrong) and individualist (right). 

8. Noretta Koertge opposes cultural relativism in a rather standard
Popperian manner. 

9. Peter Urbach seeks an argument against the doctrine of historical
inevitability as a whole, as giving rise to an objectionable approach to
social science and thus polluting any item it gives rise to. If valid, such
an argument will render a priori objectionable any hypothesis fitting
its mold. This is a very odd programme, since it is condemned by the 
fact that some important already refuted hypotheses were created to 
fit the doctrine of historical inevitability —most significant of which is 
Marx’s theory of the capitalist trade cycle. Some of Urbach’s 
criticisms of Popper’s work, however, are valid though far from new. 

10. W. A. Suchting admits much of Popper’s critique of Marx and shows
that some passages of Marx support the doctrine of historical 
inevitability, others are written in defiance of it. He opts for ‘a 
Marxism of sorts’. This is a far cry from E. H. Carr’s influential What 
Is History? which opens by dismissing Popper—and Sir Isaiah Berlin 
too—and from the epoch-making review of the second edition of The 
Open Society in this Journal by J. D. Bernal, which denies all of 
Popper’s charges and identifies Marx’s position with one which most 
scholars would view as almost identical with Popper’s—without, 
however, Bernal admitting this to be the case, of course. 

11. Robert Ackerman’s essay discusses Popper’s position in Germany,
where he had both a strong discipleship and a strong hostile 
opposition which in time is becoming friendlier. The misuse of 
Popper’s views in German politics hints at defects in it, especially in 
view of Popper’s blame for his predecessors on account of the misuses 
of their views. This is doubtful: in his Open Society Popper mentions 
the wrong accusation of Socrates on account of the fact that some of 
his disciples, notably Alcibiades, were tyrants. Ackerman’s argument 
invites further examination. 

12. Richard Kraut presents an interesting and careful criticism of
Popper’s view of Socrates as a democrat and replaces it with a
modified view according to which Socrates’ concern with ethics, not
politics, led to a more complex view, though one on the whole
favourable to democracy. Kraut notices that this invites a solution to
the Socratic problem more satisfactory than Popper’s.


J. AGASSI 
Tel Aviv University and York University, Toronto 


Physical Interpretations of Relativity Theory


LONDON: SEPTEMBER, 1988 


The British Society for the Philosophy of Science is 
sponsoring an international conference, of three days’ 
duration, “PHYSICAL INTERPRETATIONS OF 
RELATIVITY THEORY”, to review the development, 
status, and potential of the various physical inter- 
pretations of the Relativistic Formal Structure. It is 
planned to open on Friday, 16th September, and to 
close on Monday, 19th September, 1988. The location 
will be Imperial College, London. 


SECTIONS INCLUDE: 
“Physical Time and Relativity” 
“Privileged Reference Systems in Modern Physics and 
Cosmology” 
“Relativity and the Nature of the Physical Vacuum” 
“Ether Theory in the late 20th C.” 
“Relativity, Cosmology and Physical Theories of Related 
Phenomena” 
“Experimental Aspects of Relativity” 
“Formal Structures, Physical Interpretations and the 
Philosophy of Modern Physics” 


The organising committee includes Prof. C. W. Kilmister (London); 
Dr. P. Kroes, (Eindhoven); Dr. T. Sjodin, (Brussels); S. J. 
Prokhovnik, (New South Wales); Dr. S. V. Clube, (Oxford); Dr. M. 
F. Podlaha, (Munich); Dr. T. Morris, (N.P.L. Teddington); Dr. J. 
Whealton, (Oak Ridge, USA); Prof. G. Spinelli, (Milan); Prof. S. 
Bergia, (Bologna); and M. C. Duffy, (Sunderland). 


For more information, contact 
Conference Co-ordinator: M. C. Duffy, 
Mechanical Engineering Dept., Sunderland Polytechnic, 
Chester Road, Sunderland SR1 3SD, U.K.
Tel: 091 567 6191 Ext. 107 





THE LONDON SCHOOL OF ECONOMICS AND 
POLITICAL SCIENCE 


LAKATOS 
AWARD 


IN PHILOSOPHY OF SCIENCE 


The closing date for nominations for the Lakatos Award 
is 15th April, 1988. The value of the Award will be 
£10,000. The Award will be for an outstanding 
contribution to the philosophy of science in the form of 
a book published in English during the last ten years 
(that is, in 1978 or later). Candidates must be nominated 
by at least three people of recognised professional
standing. Nominators should give their grounds for the
nomination and indicate the candidate’s age, since a 
preference may be given to younger scholars. It will be 
appreciated if three copies of the book are provided. 
Nominations should be marked ‘Lakatos Award’ and 
addressed to: The Secretary, The London School of 
Economics and Political Science, Houghton Street, 
London WC2A 2AE. 


The Award is endowed by the Latsis Foundation and 
administered, on behalf of The London School of 
Economics, by a Committee consisting of the Director 
of the School, or his deputy, as chairman, and 
Professors Hans Albert, Adolf Grünbaum, Alan
Musgrave and John Watkins. The Committee will make 
the Award on the advice of an independent panel of 
selectors. 


The recipient will be expected to visit the School, and 
there deliver a public lecture of interest to a general 
audience. 





CAMBRIDGE 


Scientific Controversies 


Case Studies in the Resolution and Closure of Disputes in Science and 
Technology 
Edited by H. TRISTRAM ENGELHARDT, Jr and ARTHUR L. CAPLAN 
Scientific Controversies is offered as a contribution to the better understanding 
of the role of science and of the place of non-scientific interests in what may at 
first appear to be purely scientific undertakings. 
639 pp. 0 521 25565 1 Hard covers £35.00 net
0 521 27560 1 Paperback £15.00 net


Mind, Machines, and Evolution 

Philosophical Studies 

Edited by CHRISTOPHER HOOKWAY 

A volume of original essays by philosophers and scientists dealing with 
philosophical questions arising from work in evolutionary biology and artificial 
intelligence. The book examines a number of issues related to the search for a
‘naturalistic’ or scientific account of experience and behaviour.
188 pp. 0 521 33828 X Paperback £8.95 net





Cambridge University Press


The Edinburgh Building, Shaftesbury Road, Cambridge CB2 2RU, England 





Teaching Philosophy: Anniversary Special

Teaching Philosophy, edited by Arnold Wilson of the University 
of Cincinnati, is celebrating ten years of publication by offering 
subscriptions to new individual subscribers at a special anniversary 
rate of only $10 per year. The regular subscription rate is $17 for 
philosophers and $38 for institutions. Orders must be pre-paid and 
received by November 30, 1987 to get this special low rate. 


* * *

To help you discover the latest trends in the teaching and learning
of philosophy, Teaching Philosophy publishes articles, discussions,
reports, and reviews on these and related topics:

• theoretical issues in the teaching of philosophy
• experimental and interdisciplinary philosophy courses
• evaluation of teaching and assessment of learning in philosophy
• new books and audio-visual materials

* * *

Send orders to: PHILOSOPHY DOCUMENTATION CENTER, Bowling Green
State University, Bowling Green, OH 43403-0189. Tel. (419) 372-2419







































Brit. J. Phil. Sci. 38 (1987), 419-440 Printed in Great Britain


Purpose and Scientific Concept 
Formation* 


by ERNEST W. ADAMS and WILLIAM Y. ADAMS 


1 Introduction

2 WYA’s typology of Nubian potsherds 

2.1 General background 

2.2 Ware-descriptions 

2.3 Ware-concept acquisition 

2.4 The uses of descriptions in terms of attribute-variables 

2.5 The general concept of a ware in WYA’s system 

3 Generalising to other scientific concept systems 

3.1 Preliminary remarks

3.2 The Mohs hardness scale

3.3 Biological species, and the species controversy

Appendix A ware description in the Nubian pottery typology of W. Y. 
Adams 


1 INTRODUCTION


The scientific concept formation with which we are concerned is that 
which occurs when technical terms or systems of terms are introduced or 
deliberately modified by scientists in the pursuit of their scientific objec- 
tives. We will advocate a ‘philosophy’ of this sort of concept formation in 
which the purposes for which the terms are introduced and employed are 
central and various features of their introduction and use are explained 
‘functionally’ in terms of these purposes. We will also argue that many of 
the qualities that are often thought to be definitive of the scientific are 
‘accidental features’ that are fairly well approximated in certain cases, but 
insistence that all scientific concepts should possess these qualities can 
also be counterproductive to the actual and legitimate purposes of many 
scientific activities. Among these stereotypes are that scientific concepts
should be precise, objective, and subject to observational determination (the 
latter two have been extensively criticised in the Kuhnian tradition, but we 
will criticise them here from a different point of view). The failure to 
recognise that these qualities are desirable only to the extent that they serve 
scientific purposes, and they are not ends in themselves, stems from the 
failure to recognise the purposes for which concepts are employed and from 


Received


* The authors are greatly indebted to Professors Kent Holsinger and Paul Teller, and to an 
anonymous referee, for discussions and suggestions on earlier drafts of this paper. 




mistaking properties that are frequently approximated for attributes that 
are essential to the scientific. 

Something very close to the foregoing was previously advocated by 
E. W. Adams (henceforth ‘EWA’) in the special case of systems of quantita- 
tive measurement (EWA [1966]). In the present paper we will concentrate 
on a different kind of concept formation, namely the construction of scientific
typologies, and we will select a particular one of these as a case-study.
This is an archeological typology developed by W. Y. Adams (WYA’s system,
WYA [1986a]) which involves the classification of pottery fragments
(potsherds) recovered in the excavation of archeological sites in the Nubian 
region of the Nile valley, statistics on which are to be used for the purpose 
of estimating the ages of these sites. We will begin by describing this system 
in some detail and drawing attention to certain features of it that appear 
aberrant when judged by conventional stereotypes. Then we will attempt 
to explain how the seemingly aberrant features in fact serve the purposes 
for which the typology has been evolved. After that we will consider briefly 
the question of whether the conclusions we arrive at concerning WYA’s 
system apply to other typologies, and to scientific concept systems in 
general. In particular we will consider biological species classification, and 
one rather elementary quantitative concept system, namely the Mohs scale 
of hardness measurement. Except for occasional brief asides on the rel- 
evance of our findings to more general issues in the philosophy of science, 
we will eschew comment on these matters since we plan to discuss them in 
future publication. 

Before starting, a word should be said in justification of our focusing on a
case-study in one of the least exact and least formally sophisticated sciences, 
which therefore could seem to be too unrepresentative to draw general 
conclusions from. There are two things to be said in favour of this. One is 
that it is essential to study processes of scientific concept formation ‘from 
the inside’, in a way which allows the objectives of the scientists evolving and 
employing them to be ascertained with some assurance. That is something 
which we can do given that the author of WYA’s system is one of the 
authors of this paper. The other point is that it is often less easy to discern 
the rationale for aspects of mathematically sophisticated concept systems 
than it is to do this in the case of simpler ones, simply because in the case 
of complex systems one’s entire effort is often devoted just to mastering 
their technicalities. We will try to make it at least plausible that conclusions
about a concept system in a very ‘down to earth’ science generalise to some 
extent to more sophisticated sciences, but it is important that the case-study 
that illustrates them should be understandable apart from technicalities. 


2 WYA’S TYPOLOGY OF NUBIAN POTSHERDS 
2.1 General background 


The system we are concerned with has been evolved in the course of 




research on archeological sites in the Nubian region of the Nile valley which 
extends over a period of 28 years, during which time WYA has served as 
field director of several excavations (for reports see WYA [1961], [1962a], 
[1964], [1965] and [1970], Adams, Alexander and Allen [1983], Plumley and 
Adams [1974], Plumley, Adams and Crowfoot [1977]). A fundamentally 
important problem that arises in the course of such excavations is that of 
estimating the dates of so called proveniences, which are ‘minimal units of 
excavation’, typically measuring a square metre in area by 10 cm in thick- 
ness. It is the provenience rather than the site as a whole that it is important 
to date, given that a Nubian site will typically have been inhabited over a 
very long interval of time, say from c.200 AD to c.1z00 AD. Only the 
exceptional provenience is directly datable by reference to artifacts such as 
dated writings, and dates for the others must be estimated indirectly. This 
estimation generally makes use of materials associated with the pro- 
veniences that exhibit systematic chronological variation. The materials 
best suited for this vary with the region of excavation, but one widely used 
method makes use of the fact that ceramic fragments (sherds) tend to be 
both well preserved and very numerous (a typical Nubian provenience 
yields between 300 and 500 of them), and their types can be shown to vary 
in a way that is uniform for sites in a region and which is distinctive of the 
eras of the proveniences in which they are found (see WYA [1986a]: pp. 
617-33; WYA n.d.). 

The problem faced by the archeologist developing a sherd typology 
for the purpose of chronological estimation in a given region is that of 
determining the correlations which prevail in that region between the attri- 
bute-variables of the sherds, and the eras of the vessels of which the sherds 
are fragments. This is usually very difficult. WYA’s present system, which 
began with an ‘Urtypologie’, WYA [1962b], in which sherds were classified 
into 27 categories called wares (‘ware’ is the standard term for a category 
in a sherd typology) has evolved into the present system involving 102 
wares, which is described in detail in WYA [1986a]. This evolution required 
the analysis of data on more than one million sherds collected from independently
datable proveniences (without any assistance from computers).
Independent dates were arrived at principally through the recovery of dated 
written materials from proveniences. The present 102-ware system is the 
one we will focus on in our case-study. We will shortly note seemingly 
aberrant features of the system, but first we will describe the procedure by 
which the era of a provenience is estimated from sherd statistics. 

As noted, between 300 and 500 sherds are recovered from a provenience 
in a typical Nubian excavation. These are usually sorted on the day they 
are recovered (quick feed-back is important because date estimates to some 
extent determine ‘where to dig next’). Once sorted the numbers of sherds 
in each ware-category are recorded, along with the number of sherds that 
are simply listed as ‘unclassified’, whose quantity, which can run as high 
as 10 per cent, may also be chronologically significant. This sorting is 




facilitated by the fact that any given provenience is apt to yield significant 
numbers of sherds in only about 15 categories, with ‘trace proportions’ 
of some 20 more (cf. WYA n.d. and WYA and EWA, Archeological
Typology and Practical Reality, in preparation). These numbers are then
converted to percentages, which are recorded on a cluster index card for the 
provenience, which exhibits the percentages in quasi-graphical form. The 
final stage is the comparison of the cluster-index card with a set of 11 
‘master profiles’ of sherd proportions characteristic of given eras, to deter- 
mine which of these the provenience’s profile most nearly resembles. The 
era of the ‘closest fitting’ master profile is used as the estimated era of the 
given provenience (cf. WYA [1986b]). The durations of these eras, which at
the present stage of development vary between 50 and 200 years (as compared
with an average era-duration of 300 years for the estimates based on the
Urtypologie), give a rough measure of the accuracy of the method. This level
of accuracy might seem unacceptably crude to
those used to the methods of the exact sciences. In fact the estimates are as 
good as those presently obtainable by any other means (neutron decay, etc., 
cf. Peacock [1970], p. 378) and they are far less costly in both time and 
money. But that is not to say that the method cannot be improved upon, 
and we will return to this possibility later. 
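
The estimation procedure just described (counts per ware, conversion to percentages, comparison with master profiles) can be sketched in a few lines. This is only an illustration under stated assumptions: the text says the comparison determines which master profile the provenience 'most nearly resembles' but does not specify a resemblance measure, so the summed absolute difference used here is a hypothetical stand-in, and the ware names other than W14, the era labels, and all the figures are invented.

```python
from typing import Dict

Profile = Dict[str, float]  # ware name -> percentage of sherds


def to_percentages(counts: Dict[str, int]) -> Profile:
    """Convert raw sherd counts per ware-category into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sherds recorded for this provenience")
    return {ware: 100.0 * n / total for ware, n in counts.items()}


def profile_distance(a: Profile, b: Profile) -> float:
    """Hypothetical resemblance measure: summed absolute percentage differences."""
    wares = set(a) | set(b)
    return sum(abs(a.get(w, 0.0) - b.get(w, 0.0)) for w in wares)


def estimate_era(counts: Dict[str, int], master_profiles: Dict[str, Profile]) -> str:
    """Return the era whose master profile most nearly resembles the provenience."""
    profile = to_percentages(counts)
    return min(master_profiles,
               key=lambda era: profile_distance(profile, master_profiles[era]))


# Invented illustrative data (not taken from WYA's system).
counts = {"W14": 60, "W5": 30, "unclassified": 10}
master_profiles = {
    "era_A": {"W14": 55.0, "W5": 35.0, "unclassified": 10.0},
    "era_B": {"W14": 5.0, "W5": 85.0, "unclassified": 10.0},
}
print(estimate_era(counts, master_profiles))  # prints era_A
```

The 'unclassified' count is kept as a category of its own in the sketch because the text notes that its proportion may itself be chronologically significant.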

But now we will comment on the ware-descriptions, and on the way in 
which the ware-concepts are acquired. This will bring to light some of the 
apparently aberrant features of the typology. 


2.2 Ware-descriptions 


The reader is now urged to glance at the description of a typical one of the 
102 wares in WYA’s system, the so called TERMINAL CHRISTIAN 
WHITE WARE, or more briefly W14, which is included as an APPENDIX 
to this paper. What is most immediately striking about the description is 
its complexity, which makes it evident that it has not been designed merely 
to facilitate sorting and statistical summary. It is also obvious that the 
description is not a definition in the logician’s sense that it formulates a 
system of independent conditions which are separately necessary and jointly 
sufficient to define the concept uniquely (cf. Suppes [1957], chapter 8). In 
point of fact there are no logical definitions of the wares, so we have a 
system of concepts without such definitions. This raises the question of 
how persons can acquire the concepts. That will be returned to below, but 
first we will comment on three features of the description that would be 
aberrant if it were taken for a logical definition. 

W14, like the other 101 wares in WYA’s system, is described in terms of
seven groups of attribute-variables, or simply variables, under the headings
CONSTRUCTION, FABRIC, SURFACES, etc., all but the first of which 
break down in turn into more specific variables such as paste, density, and 
so on in the FABRIC group. There is a total of 42 of these overall. We 




follow Hodson [1982] here in using the term variable essentially in the sense 
of the statistician’s random variable. These can be regarded as functions 
whose values are specific ‘attributes’, which we will call attribute-values, 
which at least roughly partition the entities in their domains (here potsherds)
into mutually exclusive and exhaustive classes.¹ For instance, the
density variable roughly partitions the domain of potsherds into five classes 
described by the attribute-terms ‘porous’, ‘medium’, ‘fairly dense’, ‘dense’, 
and ‘very dense’. These happen to be ordered, but that is inessential. 
In fact many other variables, such as principal style in the PAINTED 
DECORATION class, are amorphous. What might seem aberrant is that 
many of these partitionings are vague, even where precise distinctions could 
be made. Moreover, they are occult in the sense that they are not subject to 
direct observational determination,² and their totality overdetermines the
concept supposedly being defined. We will comment briefly on each of 
these. 
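
The idea of an attribute-variable as a function whose values partition a domain of sherds can be made concrete with a small sketch. It idealises the situation: in the sketch every sherd receives exactly one density value, whereas the text stresses that the actual partitionings are only rough and often vague. The sherd records and identifiers are invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List

Sherd = Dict[str, str]  # hypothetical record of a sherd's attribute-values


def partition(sherds: List[Sherd],
              variable: Callable[[Sherd], Hashable]) -> Dict[Hashable, List[Sherd]]:
    """Group sherds by the value an attribute-variable assigns to each of them.

    Because the variable is a function, each sherd lands in exactly one class,
    so the classes are mutually exclusive and jointly exhaustive.
    """
    classes = defaultdict(list)
    for sherd in sherds:
        classes[variable(sherd)].append(sherd)
    return dict(classes)


def density(sherd: Sherd) -> str:
    """The density attribute-variable: maps a sherd to its attribute-value."""
    return sherd["density"]


# Invented example sherds.
sherds = [
    {"id": "s1", "density": "medium"},
    {"id": "s2", "density": "porous"},
    {"id": "s3", "density": "medium"},
]
classes = partition(sherds, density)
print({value: [s["id"] for s in members] for value, members in classes.items()})
```

Nothing in the sketch depends on the values being ordered, which matches the remark that the ordering of the density values is inessential.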

Vagueness and overdetermination are the most obvious and they are the 
easiest to account for. Consider density. The density attributed to sherds 
of type W14 is vaguely described as ‘medium’, though obviously precise 
physical values could have been specified and should have been insisted 
upon if precision were desirable for its own sake. But to insist upon it in 
the present case would not have served the purpose for which the sherds 
are classified and it would have been practically counterproductive. There 
is no reason to think that more exact physical measures of density would 
lead to more exact chronological estimates than the vague ones do, and 
requiring more exact measurements would be counterproductive because 
it would delay potsherd sorting and estimation when quick feedback is 
important. 

Next consider overdetermination. Even allowing for their vagueness the 
42 attribute-variables are neither logically nor factually independent, and 
moreover they are so overly specific that it is rare to find sherds exhibiting 
all of the attribute-values supposedly definitive of the ware in any one of 
the 102 categories. Of course this only underlines the point made earlier: 
that the ware-description really is a description, and not a definition. It is 
more plausible to regard the basic description as being of something like a 
prototype whose adequacy is commented on in the APPRAISAL, which 


1 It is customary in the typological literature to distinguish between the attribute-variable, 
e.g., density, and its value, e.g., medium, the former being called variables and the latter being 
called ‘attributes’. Paul Teller has drawn our attention to the fact that our distinction between 
attribute-variables and their values corresponds closely to W. E. Johnson’s distinction 
between determinables such as colour, and determinates, such as the particular colour red. 
See Johnson [1921], p. 174. 

2 This occultness is practical rather than in principle unobservability. For our purposes
the distinction is unimportant, since the obstacle they present to sherd-identification and 
ultimately to chronological estimation is the same. 




gives indications of range of variability and related matters (in some areas 
of the world there are officially designated ‘type sherds’, i.e., prototype
sherds; see Colton and Hargrave [1937] and Colton [1953], p. 29). This
sort of ‘prototypical characterization and appraisal’ is very important func- 
tionally, though we cannot enter into details in analysing it. But there is 
one observation to make in passing, which relates overdetermination to 
something analogous that arises in the determination of physical quantities. 

The analogy is to an apparent redundancy which can arise in the esti- 
mation of physical quantities. Two cases in point are that in which the same 
quantity is measured repeatedly in order to average and get a better estimate 
of it, and that in which an object is triangulated on from several directions 
to get a better fix on its location. There is apparent redundancy because 
only one measurement or two sightings would be sufficient if they were 
perfectly exact. But in reality the individual readings and sightings are not 
exact, and to obtain exactness one averages, either intuitively or math- 
ematically. We suggest that something similar occurs in the process of sherd 
classification, where vague ‘readings’ on individual attribute-variables are 
averaged in some intuitive fashion to get a better fix on a sherd’s ‘position 
in the ware-space’ than could be gotten from just one or a few readings, 
even though these few would suffice if they were perfectly exact.¹ We cannot
go farther into this here, though we hope to analyse it in future publication 
from the information-theoretic point of view. But now we turn to the 
problem of accounting for the fact that in many cases it does not seem 
possible to determine the values of attribute-variables by direct observation. 
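
The physical side of the analogy, that averaging several inexact readings gives a better estimate than any single one, can be illustrated by a short simulation. It is a sketch of the measurement case only and assumes independent Gaussian reading errors, which is an assumption of the example and not a claim about how sherd-classifiers combine attribute 'readings'. All numerical values are invented.

```python
import random
import statistics
from typing import List, Tuple


def rms_error(estimates: List[float], true_value: float) -> float:
    """Root-mean-square deviation of a list of estimates from the true value."""
    return (sum((e - true_value) ** 2 for e in estimates) / len(estimates)) ** 0.5


def simulate(true_value: float = 10.0, noise_sd: float = 1.0,
             readings_per_estimate: int = 25, trials: int = 2000,
             seed: int = 0) -> Tuple[float, float]:
    """Compare the error of single readings with the error of averaged readings."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    single, averaged = [], []
    for _ in range(trials):
        readings = [rng.gauss(true_value, noise_sd)
                    for _ in range(readings_per_estimate)]
        single.append(readings[0])                   # estimate from one reading
        averaged.append(statistics.mean(readings))   # estimate from the average
    return rms_error(single, true_value), rms_error(averaged, true_value)


single_rms, averaged_rms = simulate()
print(f"single reading RMS error:   {single_rms:.3f}")
print(f"averaged reading RMS error: {averaged_rms:.3f}")
```

Under these assumptions the error of the average shrinks roughly as the square root of the number of readings, which is why apparently redundant readings are not in fact redundant.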

There are two kinds of occultness that are noticeable in the attribute- 
values mentioned in the description of W14. One is simply due to the fact 
that while it is the entire ceramic vessel that has the attribute-value, what 
the sherd-classifier observes and must classify are fragments of the vessel. 
For instance, strictly speaking when a sherd is described as having the 
‘form’ attribute-value cup, what this means is that it is a fragment of a cup. 
But of course the sherd-classifier cannot literally see that the fragment 
being examined came from a cup, and therefore if we were to accept 
the observation-inference distinction we would have to conclude that the 
principal form is not observed but inferred (our study can be viewed as 
calling traditional ways of drawing this distinction into question, though 
not the distinction itself; we expect to discuss this issue in future publi- 
cation). But even if entire vessels could be observed, there are certain 
attribute-values that could not be ascertained conclusively. One is that of 
being wheel-made, which falls under the heading of CONSTRUCTION. 
This means that the vessel was formed with the use of the potter’s wheel. 


1 This does not necessarily reify an objective place in an absolute space of wares. The question 
of whether systems of judgements (e.g., as concerns classifications of sherds) that are not 
perfectly consistent can be interpreted as though they were approximations of ‘exact’ judge- 
ments cannot be pursued here. A closely related issue concerning Mohs scale hardness 
judgements is discussed in the Appendix to EWA [1966]. 




But given that vessels of type W14 are estimated to have been manufactured 
between 1300 and 1500 AD, and given the impossibility of time-travel, 
there is no way that a sherd-classifier in the 1980s can ascertain by direct
inspection that the fragment being examined came from a vessel formed on 
a potter’s wheel some 500 years ago. The problem that we are faced with 
is that of explaining ‘functionally’ why it is that descriptions of classi- 
fications that are designed for practical chronological estimation contain 
references to occult attribute-values. These points will be returned to, but 
first we will comment on ‘practical’ ware-concept acquisition. 


2.3 Ware-concept acquisition 


That the wares in WYA’s system are not logically defined does not mean 
that the only things that the classifier has to go on in acquiring the concepts 
are the written ware-descriptions. Hands-on study with specially compiled 
sherd-collections under the supervision of experts trained in the method is 
also of great importance (cf. Shepard [1965], p. 306; Thomas [1972]; Whal- 
lon [1972], p. 15). In this process the typology manual, WYA [1986a], which 
contains all of the ware-descriptions and a great deal else besides, serves in 
part as an aide-mémoire, which assists the student in assimilating the very 
complex skills that are involved in sherd-recognition. Persons having had 
laboratory courses such as in mineral-identification will be familiar with 
this type of instruction (in this section we will confine ourselves to claims 
about concept-acquisition in WYA’s system, but some speculative remarks 
about possible generalisations are made in section 3.1, and Paul Teller has 
suggested in correspondence that something like this is also true of the 
acquisition of the concepts of physics, where practice under the guidance 
of a tutor is essential, e.g., in learning to design and conduct experiments, 
and interpret their results). But, though the manual for WYA’s system is 
unusually comprehensive, it is doubtful that one could become proficient 
in sherd-recognition just from studying it. We next note certain specific 
features of laboratory instruction and its aims. 

One thing that laboratory training teaches is familiarity with what might 
be called the recognition-signs of attribute-values, whether or not they are 
occult (Dunnell [1971], pp. 45-6, distinguishes between significata, the 
attributes used to define classes, and denotata, the attributes used to 
assign objects to classes). For instance the practical sign that a sherd is a 
fragment of a vessel formed on a potter’s wheel is the evenness of its surface 
in comparison with those of sherds from hand-made vessels, which tend to 
exhibit the sorts of uneven ridges that would be expected when a vessel is 
smoothed with the fingers while being held stationary. Given this ‘oper- 
ationalization’ of the concept of being wheel-made we might be inclined to 
wonder why the occult attribute-value should be referred to at all in the 
ware-description. In fact the same question could be asked about the recog- 
nition-signs themselves. 




The aim of the laboratory instruction we are considering is the attainment 
of proficiency in sherd-recognition. The fundamental test of proficiency is
simply ‘getting the right answer’, which in practice means getting the same
answer as the expert. And, of course, that is what is wanted because what
experience has shown is that it is the expert’s answers that give the basis 
for reliable chronological estimation. But if that is all that counts then even 
proficiency in attribute-sign recognition is secondary. This leads to two 
points of fundamental importance, one theoretical and the other practical. 
The theoretical point is that the ‘necessity’ that connects the attribute- 
recognition signs with the ware-concepts is more plausibly regarded as 
psychological than as logical. It is indubitable that persons unable to deter- 
mine the evenness of surfaces are less able to make ware-classifications that 
agree with those of the experts than others are. But that is a fact of empirical
psychology. If someone were miraculously able to obtain answers in agree- 
ment with the experts even though bereft of normal sensory capabilities 
(perhaps such a person would be described as having the gift of being able 
to ‘look directly into the past’) there would be no reason to deny that he or 
she had grasped the ware-concepts, and there would be every reason to 
utilise his or her findings in arriving at chronological estimates. 

The practical point about recognition-proficiency is even more impor- 
tant. In fact about 90 per cent agreement with the experts may at times be 
regarded as acceptable, and that is as it must be given that the experts 
themselves hardly attain higher levels of agreement in sherd-classifications 
done under field conditions. Furthermore this level of intersubjective 
agreement is normally all that is wanted in Nubian Archeology, because 
there is no reason to think that higher levels would significantly increase 
the accuracy of chronological estimates. Objectivity in the sense of the attain- 
ability of intersubjective agreement is not an end in itself. Furthermore the 
fact that there are acceptable levels of it highlights as nothing else does the 
importance of taking purpose into account in the analysis of scientific 
concept formation, since what is acceptable depends on the purposes of the 
users of these concepts. 
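
The proficiency test described above, agreement with the expert judged against a level that depends on purpose, amounts to a simple calculation. The following is a minimal sketch: the 0.90 default reflects the 'about 90 per cent' mentioned in the text, but treating it as a fixed cut-off is an assumption of the example, as are the function names and the sample classifications.

```python
from typing import List


def agreement_rate(trainee: List[str], expert: List[str]) -> float:
    """Fraction of sherds on which the trainee's ware judgement matches the expert's."""
    if len(trainee) != len(expert) or not expert:
        raise ValueError("need two non-empty classification lists of equal length")
    matches = sum(t == e for t, e in zip(trainee, expert))
    return matches / len(expert)


def is_proficient(trainee: List[str], expert: List[str],
                  threshold: float = 0.90) -> bool:
    """Acceptability is relative to a purpose-dependent threshold (assumed here)."""
    return agreement_rate(trainee, expert) >= threshold


# Invented example: ten sherds, one disagreement.
expert = ["W14"] * 10
trainee = ["W14"] * 9 + ["W5"]
print(agreement_rate(trainee, expert))  # prints 0.9
print(is_proficient(trainee, expert))   # prints True
```

Making the threshold a parameter, and not a constant, mirrors the point that what counts as acceptable agreement depends on the purposes of the users of the concepts.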

But now that we have stressed the importance of laboratory instruction 
in the acquisition of the ware-concepts, we can make some observations on 
the role that the occult attribute-variables play in the ware-descriptions. 


1 This is discussed at length in WYA and EWA, Archaeological Typology and Practical Reality, 
in preparation. Paul Teller has pointed out that this is also true of physics and astronomy. 
This is also connected with the metaphor of observers as measuring instruments (“A human
being is, from the point of view of physics, a certain kind of measuring apparatus”; van
Fraassen [1980], p. 17). This would fit in with the point made below that theorising about
the perceptual and interpretive processes involved when scientists make observations is 
dubious psychologising given the present state of perception theory and cognitive 
psychology. Scientists’ ‘use’ of observers whose psychology they don’t understand is
analogous to their use of instruments whose principles they don’t understand. However, this 
is not to commit us to the aptness of the observer-measuring device metaphor. 


2.4 The uses of descriptions in terms of attribute-variables 


The sorting of sherds into ware-categories is to an extent a matter of 
educated guess-work in which the recognition of very complex and probably 
not wholly analysable ‘cues’ plays an important part (Clarke [1968], p.
190; Dunnell [1971], pp. 45-6). It is plausible to conceptualise a ware- 
categorisation judgement as a probable inference whose premises are attri- 
bute-sign cues and whose conclusion is a ware-category judgement. This 
conceptualisation suggests that the occult attribute-values that appear in 
the ware-descriptions might fit into the picture in the following way. Con- 
sider the attribute-value of being a sherd of a wheel-made vessel. The 
judgement that a sherd has this attribute-value can also be regarded as a 
probable inference, but it is much more probable and much less complex 
than the judgement that it is of a particular ware-category, say W14. Fur- 
thermore we can say, counterfactually, that if it were possible to establish 
that the sherd was wheel-made and had the other attribute-values in terms 
of which W14 sherds are described, the sherd’s ware-category could be 
established as a practical certainty. Hence for the purpose of sherd-rec- 
ognition the classifier might be thought of as proceeding by stages, first by 
assuring the practical certainty that a given sherd is wheel-made (and it 
possesses other significant attribute-values), and then by making a ‘sec- 
ondary inference’ that it is of a particular ware-category. There is much 
dubious psychologising in this, but there is a connection between it and 
another role that is played by the occult attribute-variables which is of great 
methodological importance.¹

Though we have spoken of chronological estimation as the purpose for 
which WYA’s typology has been constructed, it is more accurately said to 
be the primary purpose in that this is what guided the selection of the 
attribute-variables in terms of which wares are distinguished. Very roughly, 
we can describe the typology as being so constructed that sherd- 
categorisation in terms of its types will be maximally informative about 
chronology. But that does not mean that other possible uses have been 
wholly ignored. In fact it is to be expected that technological advances will 
eventually permit more accurate chronological estimation, and this will 
render the present typology obsolete so far as its primary use is concerned. 
But the system will not thereby be rendered useless, because classification 
in its terms is potentially informative about other things (see WYA [1986a], 
pp. 6-8). And, the attribute-variables entering into the ware-descriptions 
give indications of what these other things might be. For instance, some 
research currently being planned will reconsider already gathered sherd- 
statistics from the point of view of the information that might be gained 


1 As with the observation-inference distinction previously commented on (footnote 1, previous
page), we hope that our functional-purposive approach can side-step controversial issues 
having to do with cognitive processes involving sensations, mental images, inferences, 
meanings, gestalts, and so forth. 




from them about economic phenomena. The wheel-made vs. hand-made 
distinction is expected to yield information about trading patterns, since 
hand-made vessels were almost never traded while wheel-made ones tend 
to have been traded over long distances. 

The point of the foregoing is that while a typology may have been 
designed with one or a few uses paramount, the typologist normally 
envisions a ‘halo’ of potential secondary uses, of which the attribute-vari- 
ables are indicative. The typologist may even be unclear about his or her 
objectives, and think of the typology as ‘universal’ and informative for all 
purposes (cf. Sabloff and Smith [1969], pp. 280-2; Schiffer [1976], p. 93). 
We believe that this must be an illusion (as do Brew [1946], p. 64, Dunnell 
[1971], p. 115, and Hill and Evans [1972], p. 235), though this controversial 
point will be returned to in section 3.3. However, a final word must be said 
about the general ware-concept in WYA’s system, as against the particular 
ware-concepts that fall under it, like that of being a vessel of type W14. 


2.5 The general concept of a ware in WYA’s system 


Looking ahead to section 3.3, let us note that it is at the level of the general 
type concept much more than at that of the individual type falling under it 
that methodological controversies tend to arise. Thus, in the current bio- 
logical species controversy it is the general idea of a species, as against, say, 
that of Eschscholtzia californica (the California poppy), that is at issue (see 
Holsinger [1984] on this). It could appear strange, therefore, that in most 
of the sherd typologies in field use, including WYA’s system, the general 
concept of a ware is specified in far less detail than are the concepts of the 
particular wares in these systems (for example, see Colton and Hargrave 
[1937], pp. 2-3). In fact, this is easily explained. Ware-descriptions are 
designed to be used by persons applying typologies already in existence, 
while the function of giving a general definition of a system of ware- 
categories, aside from helping users to ‘get the general idea’ of the system, 
is to guide researchers in evolving possible modifications or extensions of 
the system, or even in developing entirely new systems (see Colton and 
Hargrave [1937], pp. 19-22). These are very different sorts of activities, 
and given that the typologist has usually already constructed his or her 
system and aims primarily to describe its application, he or she will naturally 
focus on the details of that and not on the general guidelines followed in 
constructing the system. Additional risks are run in attempting to formulate 
general rules of typology construction. This is something that is easier said 
than done, and it is easy to make mistakes in describing one’s own methods 
(cf. Brew [1946], pp. 44-6, and Hill and Evans [1972]). In Mach’s famous 
phrase, one is to a large extent guided by the ‘tact of the natural investigator’ 
(Mach [1915], p. 285) whose years of experience and reflection in a par- 
ticular field cannot easily be compressed into a few mechanical maxims to 
be handed over to neophytes. But there are a few observations that can be 




made on the general concept of a ware, and on the processes by which 
particular wares in WYA’s system have come to be distinguished. 

First, given the general goal of developing a typology that will be useful 
for the purpose of chronological estimation, it is to be expected that the 
actual development of the typology will be an evolutionary process involving
successive approximations (this point is developed at length in WYA
and EWA, in preparation). One starts with guesses as to what attributes 
and attribute-combinations are likely to be chronologically significant, and 
then gathers statistics on sherds that have these attributes which are re- 
covered from independently datable proveniences. Most emphatically, one 
does not attempt to be ‘objective’ in the manner recommended by the 
numerical taxonomists (cf. Sneath [1962]) by indiscriminately seeking out 
all possible correlations between all possible attributes. Even if that were 
practically possible (whatever ‘all possible attributes’ might be) there is no 
a priori reason to suppose that the correlations that might be discovered in 
that way would be useful for the purpose of chronological estimation. As 
already suggested, we believe that it is vain to search for a typology that 
will be useful for all possible purposes, and the typologist must have one 
or a few purposes in view at the outset for guidance in the selection of 
attributes to be studied. That is not to say that there are no ‘joints in nature’ 
to be looked for, but what count as the significant natural divisions must 
still be dependent on the typologists’ purposes (cf. Spaulding [1982], p. 
11). 

Given statistics on associations between attributes that appear chron- 
ologically significant a priori, two complementary processes are involved in 
refining the initial attribute selection. These are splitting and lumping.¹ In
the splitting process sherds that were originally classified under a single 
heading are distinguished into two or more types, because it is found 
that these distinctions are chronologically significant. The reverse process 
occurs in lumping, when two or more previously distinguished types are 
found not to differ significantly in their chronological distributions, and so 
they are lumped into one. This happened in certain cases in evolving the 
present 102-ware system from the 27-ware Urtypologie, though obviously 
lumping was much less common than splitting. But the fact that it occurs 
at all is very important, because this runs counter to so-called ‘objective’
typological methodologies such as those of the numerical taxonomists, 
according to which all associations of attributes are equally ‘valid’ and are 
equally important to distinguish (cf. Clarke [1968], pp. 512-624; Doran 
and Hodson [1975], pp. 158-86; Aldenderfer and Blashfield [1978]). But 
that is because these approaches conceive of typology construction as an 


1 See WYA [1975], p. 88. Paul Teller reminds us that phenomena akin to splitting and lumping
occur in the evolution of classification systems in physics. Examples include ‘splitting’
elements into isotopes (e.g., C12 and C14), or ‘lumping’ hydrogen, deuterium, and tritium
together simply as hydrogen.




end in itself, without reference to the purposes for which such systems are 
constructed. 
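
The two processes can be sketched in code. The following is only a minimal illustration and not WYA's actual procedure: the function names, the use of mean provenience dates, and the `tolerance` threshold standing in for 'chronologically significant' are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def split(sherds, tolerance):
    """Split one type by an attribute if the subgroups differ chronologically.

    `sherds` is a list of (attribute_value, date) pairs for a single type.
    `tolerance` (hypothetical) is the smallest difference in mean date that
    is treated as chronologically significant.
    """
    groups = defaultdict(list)
    for value, date in sherds:
        groups[value].append(date)
    means = [mean(dates) for dates in groups.values()]
    if max(means) - min(means) >= tolerance:
        return dict(groups)  # the distinction is worth drawing
    return {"all": [date for _, date in sherds]}  # keep a single type


def lump(types, tolerance):
    """Merge types whose mean dates differ by less than `tolerance`.

    `types` maps a type name to the list of dates of its sherds.
    """
    merged = []
    for name, dates in sorted(types.items(), key=lambda item: mean(item[1])):
        if merged and abs(mean(dates) - mean(merged[-1][1])) < tolerance:
            prev_name, prev_dates = merged[-1]
            merged[-1] = (prev_name + "+" + name, prev_dates + list(dates))
        else:
            merged.append((name, list(dates)))
    return dict(merged)
```

The point of the sketch is that both functions consult only the chronological purpose: an attribute distinction that makes no difference to dating is dropped, however 'real' the association of attributes may be.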

We will have more to say about the ‘logic’ of the relation between general 
type-concepts and the particular types that fall under them in section 3.3, 
but there are two further points to make about WYA’s system in this 
connection. One is the obvious one that whatever general rules are for- 
mulated to guide the construction of a typology, the particular type-dis- 
tinctions arrived at in following these rules must depend on facts that are 
not ‘given’ a priori. In the case of WYA’s system these are facts concerning 
the associations that actually obtain between sherd attribute-variables and 
chronology. It follows that knowing the general ware-concept in WYA’s 
system would not permit one to deduce a priori that a particular ware with 
the description of W14 must fall under it. This is directly related to the 
second point. 

Given that the formulation of a general methodology of typology con- 
struction is difficult and that it is something about which it is easy to 
make mistakes, and given that actual typology construction is a matter of 
successive approximations which depends as much on a posteriori associ- 
ations as it does on general rules, it is to be expected that there will only be 
a loose association between general type definitions and the particular type 
concepts that fall under them. This suggests in turn that changes in general 
type-concept definitions should not be expected to result in radical changes 
in the particular type-concepts that fall under them. Where changes are 
found in the latter, as in the evolution from the 27-ware Urtypologie to the 
present 102-ware system, this is more apt to result from new factual 
discoveries, e.g., as concerns chronological associations, than from changes 
in the rules defining the general concept of a ware. This broad methodo- 
logical claim will be returned to in section 3.3, in remarks on the contro- 
versy concerning the biological species concept. But first we must comment 
on the degree to which our conclusions about WYA’s system generalise to 
other scientific concept systems. 


3 GENERALISING TO OTHER SCIENTIFIC CONCEPT SYSTEMS 
3.1 Preliminary remarks 


We shall not pretend that all of the features that we have noted in WYA’s 
system are to be found in equal measure in other scientific typologies, much 
less in scientific concept systems in general. The extent to which they 
generalise is a matter of degree and depends on the feature in question. But 
in the following two sections we want at least to make it plausible that there 
is some general validity to lessons learned from WYA’s system, by citing 
two more examples of systems which exhibit features similar to those 
observed in WYA’s system. These examples are the Mohs Scale of hardness 
measurement, in which we will summarise points made in an earlier paper, 




EWA [1966], and biological species classification. Hardness will be dis- 
cussed in section 3.2 and biological species will be discussed in section 3.3. 
However, it is desirable first to say a bit more about the purposes and 
associated uses of the scientific concept systems with which we are con- 
cerned, not so much to clarify these problematic ideas as to note com- 
plexities. 

Though it is tautological to say that deliberately developed scientific 
concepts are developed for purposes, nonetheless most well known scientific 
concept systems, unlike WYA’s system, have been evolved through the 
efforts of many individuals, and they are applied by many more. We can- 
not claim identity of purpose on the part of all individuals developing and 
applying these concept systems, and therefore it is questionable whether 
‘the’ purposes or uses of these concept systems can be identified with those 
of their developers and users. Nonetheless we would still urge that useful 
insights into the nature of the scientific enterprise are to be gained by 
considering its concept systems from a functional point of view, though 
their purposes, uses, and functions now become autonomous (in somewhat 
the same way that ‘use for sitting’ is autonomous to the idea of being a 
chair, whatever uses individuals put chairs to). Thus we will shortly suggest 
that useful insights are to be gained into the Mohs Scale system of hardness 
measurement and into biological species classification if we regard these as 
technologies whose applications are designed to be useful for more or less 
well defined purposes, though particularly in the biological species case this 
begs important questions. These will ultimately be left unresolved, but for 
the present we will restrict our inquiry to concept systems that are like 
WYA’s system in the following way. The intended applications of such a 
system involve the acquisition of data that are to be described in terms of 
the concepts of the system, and the data so described are intended to provide 
information of specifiable kinds about things that are extrinsic to the data.¹
WYA’s system can be conceptualised in this way, since its application 
involves the examination of potsherds (the data), which are described in 
terms of its concepts (e.g., as a W14), and the so described data are used 
to provide information about something extrinsic to the types: namely 
chronology. We will now argue that Mohs Scale hardness measurement 
can be similarly construed, and we will suggest more contentiously in the 
final subsection that it may be useful to regard biological species classi- 
fication in the same way. The following remarks on the Mohs Scale are 
largely a summary of points made in EWA [1966]. 


1 This is very broad and is not meant to be precise. The point is to contrast observations
involving the determination of the applicability of a concept, such as measurements of 
distances, which are done for the purpose of testing theories of those concepts (e.g., geo- 
metrical theories), with observations which are done for ‘extrinsic’ purposes like calculating 
travel-times. Of course, in the more sophisticated sciences one might say that all observations 
may be motivated by either intrinsic theoretical concerns or by extrinsic ones. We would 
argue, however, that philosophy of science has greatly overstressed the intrinsic-theoretical 
motivation. 


3.2 The Mohs hardness scale 


The original version of the Mohs Scale of hardness was described by F. 
Mohs in 1820, and through successive refinements it has evolved into the 
standard scale by which the hardnesses of minerals are measured today 
(though it has ancillary applications such as to ceramics, as its inclusion as 
one of the attribute-variables in the category of FABRIC shows). As this 
suggests, the primary purpose of Mohs Scale hardness measurement is to 
provide information about the properties of minerals, and in fact the most 
common use is to aid in their identification. With caution, we may say that 
mineral identification stands to Mohs Scale hardness as provenience dating 
stands to classification in WYA’s system. For instance, mineral composition 
is extrinsic to the data from which Mohs Scale hardness measures are 
derived (essentially scratch comparisons with standard minerals) in some- 
what the same way that provenience dates are extrinsic to potsherd attri- 
butes. 

Many features of the Mohs Scale are explainable by reference to its 
intended use in mineral identification, of which the most obvious is the fact 
that only objects of hardnesses comparable to minerals have a place on it. 
Though in everyday speech we say that beds are hard or soft, it makes no 
sense to speak of a bed’s hardness on the Mohs Scale. EWA [1966] also 
argues that the 10 minerals that have been selected as the standards on the
Mohs Scale (e.g., calcite near the low end with Mohs Scale hardness 3, and 
quartz near the upper end with Mohs Scale hardness 7) are well chosen in 
the sense that Mohs Scale scratch test comparisons with them will yield a 
maximum of useful information about the mineral content of specimens of 
unknown composition. Similarly, the fact that at present the finest divisions 
that are recognised on this scale are at intervals of 0.1 reflects the fact
that while finer subdivisions could be made they would not yield useful
information about mineral content because specimens of the same mineral
generally vary in hardness by more than 0.1 on the Mohs Scale. Incidentally,
it is significant that specimens of the standard minerals are less variable
in their hardnesses than most other minerals are. 

There are certain similarities between what could be called Mohs Scale 
hardness descriptions and the ware-descriptions in WYA’s system. While 
the following may not be the last word in Mohs Scale hardness charac- 
terisation, here is one such description, which is quoted from Ford [1947], 
p. 214: 

If the mineral under examination is scratched by a knife blade as easily as calcite
its hardness is said to be 3; if less easily than calcite but more so than fluorite its
hardness is 3.5.


Obviously the description is incomparably simpler than that of W14 in 
WYA’s system, which reflects the fact that Mohs Scale hardness is con- 
ceptually far simpler and easier to master than is ware-classification, but 
there are similarities both in vagueness and in occultness. ‘Scratched by a 




knife blade as easily as calcite’ is vague, but it is acceptable for most purposes
because demanding greater precision (e.g., by insisting on the use of sclerometers,
i.e., scratch meters) would not yield substantially greater information
about mineral content, and to insist on it would be counterproductive
because of expense in time and money. The attributes of
being calcite or fluorite may or may not be occult, since it is possible that 
they could be conclusively established with the use of sufficient laboratory 
equipment, but they are certainly occult under field conditions, where one 
takes for granted that specimens exhibiting the recognition signs of these 
standards are calcite and fluorite. 
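
A description of this kind can be read as a simple bracketing procedure. The sketch below is an illustrative simplification, not Ford's procedure: it assumes a caller-supplied `compare` function (hypothetical) that reports, for each standard mineral, whether the specimen is scratched more easily, equally easily, or less easily than that standard, and it assigns a half-step between the two adjacent standards that bracket the specimen.

```python
# The ten standard minerals of the Mohs Scale and their assigned values.
MOHS_STANDARDS = [
    ("talc", 1), ("gypsum", 2), ("calcite", 3), ("fluorite", 4),
    ("apatite", 5), ("orthoclase", 6), ("quartz", 7), ("topaz", 8),
    ("corundum", 9), ("diamond", 10),
]


def mohs_hardness(compare):
    """Estimate Mohs hardness from comparisons with the standards.

    `compare(name)` (hypothetical) returns 0 if the specimen is scratched
    as easily as the named standard, a positive number if it is harder,
    and a negative number if it is softer.
    """
    lower = upper = None
    for name, value in MOHS_STANDARDS:
        outcome = compare(name)
        if outcome == 0:
            return float(value)      # matches a standard exactly
        if outcome > 0:
            lower = value            # hardest standard found softer so far
        elif upper is None:
            upper = value            # first standard found harder
    if lower is None or upper is None:
        return None                  # falls outside the range of the scale
    return (lower + upper) / 2       # e.g. 3.5 between calcite and fluorite
```

The sketch inherits the vagueness of the description it models: everything depends on the judgements packed into `compare`, and nothing in it guarantees that those judgements are mutually consistent.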

There is not space to enter into other aspects of the functional analysis 
of the Mohs Scale, and rather than going into further detail we will 
conclude this section with a few remarks on the contrast between this 
approach and the ‘representational’ analysis of this scale as developed in 
current theories of fundamental measurement (cf. Roberts [1979], p. 61). 
According to the representational conception, Mohs Scale hardness values 
must correspond to or ‘represent’ ordinal scratch comparison relations in 
the sense that one mineral specimen should have a higher value than another 
on the Mohs Scale if and only if the first scratches the second when it is 
rubbed over the second one’s surface. While there is something to this, the 
thing to note is that it makes no mention of the use to which Mohs Scale 
measures are meant to be put: namely as indices of mineral content. It is 
made to seem that the primary use of Mohs Scale measurement is to provide 
information not about mineral composition but about scratch relations, as 
though knowledge of these relations were an end in itself. This in turn 
makes it appear essential that the scratch relation should satisfy certain 
empirical laws which are ‘necessary conditions for representability’ (cf.
Manders [1977]), such as that it should be transitive. Actually scratch 
relations, vague as they are, do not perfectly satisfy these laws, which, 
ironically, led N. R. Campbell to deny that the Mohs Scale really measures 
hardness saying: ‘Accordingly the definition does not lead to a definite order 
of hardness and does not permit the measurement of hardness’ (Campbell 
[1921], p. 128). We would say that given the intended use of Mohs Scale 
hardness determinations for mineral identification there is no reason why 
they should exactly satisfy the transitivity law, and Campbell’s assertion is 
almost a paradigm case of a formal standard being applied inappropriately 
without taking account of the purposes for which measurements are made.¹


1 The work of Adams and Carlstrom [1979] suggests that such defects are not as devastating
as critics like Campbell or opponents of the interbreeding population approach to biological
species have thought (cf. comments on the latter in section 3.3). Adams and Carlstrom show
that any relation that approximately satisfies the laws either of ordering or of equivalence
relations (e.g., for most triples of individuals, x, y, and z, if x scratches y and y scratches z
then x scratches z) must itself approximate an ideal ordering or equivalence relation. Hence,
if the scratching relation satisfies the laws of ordering relations in most instances then
necessarily a ‘small change’ in the relation must yield one which satisfies these laws in all
instances. The general significance of ideas about ‘small deviations’ from ideal norms is
something we hope to explore in a future publication.
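
The notion of a law holding 'for most triples' can be made concrete. The following sketch is an assumption-laden illustration, not the measure used by Adams and Carlstrom: it simply counts, among all ordered triples to which the transitivity law applies, the fraction for which it holds. The names `transitivity_rate` and `scratches` are hypothetical.

```python
from itertools import permutations


def transitivity_rate(items, scratches):
    """Fraction of applicable triples for which transitivity holds.

    A triple (x, y, z) is applicable when x scratches y and y scratches z;
    the law is satisfied for it when x also scratches z.
    """
    applicable = 0
    satisfied = 0
    for x, y, z in permutations(items, 3):
        if scratches(x, y) and scratches(y, z):
            applicable += 1
            if scratches(x, z):
                satisfied += 1
    # With no applicable triples the law is vacuously satisfied.
    return satisfied / applicable if applicable else 1.0
```

A rate of 1.0 corresponds to a relation that satisfies the law exactly; the functional view asks only whether the rate is high enough for mineral identification, not whether it equals 1.0.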




As a concluding aside, let us comment briefly on the fact that in recent 
years controversy over fundamental measurement has shifted from what 
constitutes ‘genuine’ measurement to what may be done with the results of 
measurements, and what may be said ‘meaningfully’ about them (cf.
Stevens [1946], and Adams, Fagot and Robinson [1965]). We cannot enter 
into the details of this controversy (see section 1 of EWA [1966] for a 
critique of these views), but again we would claim that both sides consider 
measurement as something that is done for its own sake. This ignores 
the actual extrinsic reasons for which measurements are made, and leads 
in consequence to inappropriate normative strictures. Incidentally, one 
of these is violated in the description of W14, which specifies the average 
hardness of specimens of W14 as 3 on the Mohs Scale. According to 
S. S. Stevens, averaging in the sense of taking an arithmetic mean is not 
a ‘permissible’ statistical operation on measurements of this kind. We 
believe that the question ought to be: does this serve the purposes for 
which these hardnesses are measured? 
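
The formal ground of the stricture can be shown with a toy computation. The numbers below are invented for illustration and are not measurements of any sherds; `relabel` is a hypothetical order-preserving renumbering of the same ordinal scale. Under such a renumbering a comparison of arithmetic means can reverse, while a comparison of medians cannot.

```python
from statistics import mean, median


def compare(stat, first, second):
    """Return -1, 0 or 1 as stat(first) is below, equal to or above stat(second)."""
    a, b = stat(first), stat(second)
    return (a > b) - (a < b)


# Invented ordinal scores for two groups of specimens.
group_a = [2, 2, 9]
group_b = [4, 4, 4]

# A hypothetical renumbering that preserves order (2 < 4 < 9 becomes 2 < 4 < 5).
relabel = {2: 2, 4: 4, 9: 5}
group_a_relabelled = [relabel[v] for v in group_a]
group_b_relabelled = [relabel[v] for v in group_b]

mean_before = compare(mean, group_a, group_b)
mean_after = compare(mean, group_a_relabelled, group_b_relabelled)
median_before = compare(median, group_a, group_b)
median_after = compare(median, group_a_relabelled, group_b_relabelled)
```

This shows only why averaging is not invariant on an ordinal scale; it leaves open the question raised above, namely whether an average hardness nonetheless serves the purpose for which the ware-description is used.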


3.3 Biological species, and the species controversy


We will largely concentrate on the species controversy, but first we will 
make a few comments on the conception of biological species classification 
as a technology whose application involves the acquisition of data described 
in biological species terms, and which provides a specifiable kind of extrinsic 
information. It is plausible enough that sorting by species is a very complex 
procedure which involves ‘recognition-signs’ of attributes, many of which 
are vague and more or less occult. What is less evident is that this sorting 
is expected to yield specifiable kinds of information that are extrinsic to the 
attributes themselves. 

What do we learn of a flower in establishing that it is of the genus 
Eschscholtzia, species californica (the California Poppy)? The answer 
“everything about the flower that is true of it in virtue of its being of this 
species” is true but uninformative. It is more informative to be told that 
determining the species tells us about the flower’s genetic makeup and 
thereby about all of the things that are determined by this makeup— 
morphology, probable life-history, reproductive behaviour, and so on. 
Probably information about nutritional and medicinal value also falls under 
this heading. The question is: is it reasonable to think of species distinctions 
as though they were made in such a way that determinations of species will 
yield maximal information about morphology, reproduction, and so on?¹


1 An alternative possibility is that species theorists tend to be evolutionary biologists, who are 
tacitly agreed that the purpose of biological research is to clarify the evolutionary picture 
and who construct their concept systems accordingly. If true, there would be a certain irony 
in the fact that Linnaeus himself was not an evolutionist. As Larson says, ‘Widespread
acceptance of the (Linnaean species) concept seems to indicate continued belief in the version
of creation found in Genesis, . . .’ (Larson [1970], p. 94).




This is a very complicated question which we can hardly do more than raise
here. However, we would suggest the following as a plausible claim about 
the Linnaean system of botanical classification in terms of the reproductive 
organs of plants. That is that this system is in fact more informative about 
the above-mentioned things than were the morphological classifications of 
the herbalists which it superseded. Why that should have been so is another 
matter. Conceivably this might be because the reproductive organs are 
related to reproduction, and reproduction is more than mere generation: it 
involves reproducing the individual. If reproduction were mere generation, 
and all that could be said was that the seeds of one living organism would 
grow into another living organism probably not resembling the first in 
either form or life-history, then indeed data concerning the parentage of 
individuals would give us little useful information. But that is not so, and 
the fact that within limits the individual reproduces the species is what 
gives reproductive behaviour the meaning it has in biological classification. 
But these highly speculative considerations cannot be pursued further here, 
and we conclude with some sketchy comments on how the current contro- 
versy over the biological species concept appears from the point of view of 
our approach to scientific classification. 

We suggested in section 2.5 that the primary function of definitions of 
general type concepts such as species or ware is that of laying down guidelines 
for extending or modifying systems of individual type concepts. Further- 
more, given the fact that the evolution of type concept systems is an evol- 
utionary and approximative process, one expects only a rather loose con- 
nection between general type definitions and the individual type concepts 
that fall under them. For instance, one would not expect a change in the 
definition of biological species to lead to a radical change in the description
of Eschscholtzia californica, or in the laboratory and/or field techniques that
are used to identify this plant. In fact one could argue that the shoe is on 
the other foot: it is the individual species concept that is primary and any 
changes that are proposed at the more abstract level of species in general 
must not radically tamper with the individual species distinctions that are 
now recognised. But though entrenched individual species concepts may 
be more or less impervious to changes in general species definitions, the 
latter may play a significant role in guiding extensions to include new 
species and even revisions in the descriptions of entrenched species. It is 
to be expected that this should be controversial, and our next comments 
pertain to that. 

For the moment let us assume that the methodologists of biology are in 
agreement on the goals of biological classification, though this will be 
questioned below. Given this assumption there should also be agreement 
on what constitute good rules for describing new biological species. 
Roughly, these should be ones which, in combination with the facts, lead 
to characterisations of individual species concepts that best serve the pur- 
poses for which these concepts are designed (remember that the individual 




types arrived at in classification systems depend as much on facts established 
a posteriori as on general rules for type-characterisation). On this view there 
could well be disagreements as to what general definitions and rules would 
lead to ‘best’ individual species concepts. Certain aspects of the current 
species controversy could be regarded as involving a particular kind of 
disagreement of this sort. 

In an early version of his interbreeding population conception Ernst Mayr 
formulated this idea as follows: 


The word species is likewise such a relational term. It separates interbreeding 
populations from all others (Mayr [1948], p. 371). 


One of the controversial aspects of this approach has to do with the fact 
that the interbreeding population idea involves the factual assumption that 
interfertility is an equivalence relation (i.e., it is transitive and symmetric). 
Recent criticism (cf. Hull [1970], Kitcher [1984]) shows that this assump- 
tion is at best ‘approximately true’, and this can be viewed as a defect in 
Mayr’s definition of species, regarded as implicitly giving rules for the 
characterisation of individual species. Mayr’s rules cannot be straight- 
forwardly applied in such a way as to lead to unambiguous characterisations 
of individual species. This type of ‘structural defect’ is common to many 
methodologies of concept formation, including theories of quantitative 
measurement. They have certain factual presuppositions, and the fact that 
these are only ‘approximations’ renders the rules either inapplicable or 
requires that further rules, often of an ad hoc character, should be for- 
mulated in order to ‘deal with messy realities’. Criticising rules of concept 
formation because of structural defects like the one just noted is straight- 
forward, and if that were all that was involved in the current species 
controversy there would be little more to be said.¹ However, there is another
aspect of the controversy that is much more difficult to deal with, which 
we will comment on in closing. 
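
One familiar way of turning a relation that is only approximately an equivalence relation into an exact one is to take the classes generated by chaining it. The sketch below illustrates that idea only; it is not the construction of Adams and Carlstrom, and the names `equivalence_classes` and `interfertile` are hypothetical. It treats the relation as symmetric and groups together any populations linked by a chain of interfertility.

```python
def equivalence_classes(populations, interfertile):
    """Partition populations into classes linked by chains of interfertility."""
    parent = {p: p for p in populations}

    def find(p):
        # Follow parent links to the representative of p's class.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p in populations:
        for q in populations:
            if p != q and interfertile(p, q):
                parent[find(p)] = find(q)  # merge the two classes

    classes = {}
    for p in populations:
        classes.setdefault(find(p), set()).add(p)
    return sorted(classes.values(), key=sorted)
```

Where transitivity fails only occasionally the resulting partition differs little from the original relation. Where it fails along a long chain, every link of which is interfertile, the closure lumps the whole chain into one class, which is a large change rather than a small one; this is the kind of case in which further rules ‘of an ad hoc character’ would be needed.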

It is not uncommon to hold that certain classifications are ‘objective’ and 
reflect ‘real divisions in nature’, and to criticise others as ‘artificial’ and 
lacking in objective significance. Thus, Mayr says of his species charac- 
terisation that “The gap between species is well defined and has objective 
reality” (Mayr [1949], p. 372, our italics). What is this ‘objectivity’ and
why should it be thought desirable that scientific classification systems 
should possess it? Though it is an open question which we hope to discuss 
in future publication whether the idea of objectivity can be accommodated 
within our approach, we suggest that much discussion relating to it is 


1 See WYA and EWA, Archaeological Typology and Practical Reality, in preparation, for further
elaboration of this point. It is worth noting that the work of Adams and Carlstrom [1979]
commented on in footnote 1, page 15 shows how the criticism that the relation of
interbreeding isn’t a perfect equivalence relation could be met. If the relation approximately
satisfies the laws of equivalence relations (i.e., those laws hold in most instances) then a small
change in its extension will transform it into one which exactly satisfies these laws, and this
in turn can be used as the basis for defining ‘interbreeding species’.


Purpose and Scientific Concept Formation 437 


confused, and that it might help to clarify matters if the purposes which 
are served by biological classification were taken more syatematically into 
account. 

Obviously any classification system will involve both artificial and ‘objec- 
tive factual’ components. All man-made and deliberately developed concept 
systems are ipso facto artificial. On the other hand, to the extent that
scientists follow rules in characterising individual concepts and the concepts 
to which they are led in following these rules depend on facts that are 
beyond their control, these concepts are not defined by wholly arbitrary 
‘conventions’. Thus, whatever the facts are, Mayr’s suggestion that the gap 
between interbreeding populations has objective reality at least makes sense. 
If it were true then to that extent the species concept that Mayr proposes 
to base on it would also have an element of objectivity. 

The more controversial claim is that the rule that species correspond to 
interbreeding populations is ‘natural’ in contrast to possible alternative 
‘artificial’ rules for distinguishing species such as might rely on mor- 
phological features like the colours of flowers. If any sense is to be made of 
such a suggestion, it seems to us that it must be connected with the idea 
that there is one ‘right’ mode of biological classification which will serve all 
purposes. As to this, we agree with Holsinger that “. . . there is no reason 
to believe that the circumscription that is appropriate for one investigation 
will be appropriate for another” (Holsinger [1984], p. 301; much of what 
we say in the present section is in close agreement with Holsinger’s views). 
But we take the widely held belief that there is just one ‘right’ way to 
classify the entities or phenomena of a field to be an indication that the 
methodologists of classification theory are not clear about the goals of 
biological classification. As W. S. Jevons put it: “In approaching the ques- 
tion how a given group of objects may be best classified, let it be remarked 
that there must generally be an unlimited number of modes . . .” and “. . .
we must not attribute exclusive excellence to any one mode of classification” 
(Jevons [1877], pp. 677 and 722). We suspect that failure to realise this may 
as much as anything else be what is at the root of the species controversy. 


REFERENCES 
ADAMS, E. W. [1966]: ‘On the Nature and Purpose of Measurement’, Synthese, 16, pp. 125-69.

ADAMS, E. W. and CARLSTROM, I. F. [1979]: ‘Representing Approximate Ordering and
Equivalence Relations’, Journal of Mathematical Psychology, 19, pp. 182-207.

ADAMS, E. W., FAGOT, R. F. and ROBINSON, R. E. [1965]: ‘A Theory of Appropriate
Statistics’, Psychometrika, 30, pp. 99-127.

ADAMS, W. Y. [1961]: ‘Archeological Survey of Sudanese Nubia’, Kush, 9, pp. 30-43.

ADAMS, W. Y. [1962a]: ‘Archeological Survey on the West Bank of the Nile’, Kush, 10, pp. 
62-75. 

ADAMS, W. Y. [1962b]: ‘An Introductory Classification of Christian Nubian Pottery’, Kush,
10, pp. 245-88. 

ADAMS, W. Y. [1964]: ‘Sudan Antiquities Service Excavations in Nubia: Fourth Season, 
1962-63’, Kush, 12, pp. 216-50. 


ADAMS, W. Y. [1965]: ‘Sudan Antiquities Service Excavations at Meinarti’, Kush, 13, pp. 
148-76. 

ADAMS, W. Y. [1970]: ‘The University of Kentucky Excavations at Kulubnarti, 1969’. In
Erich Dinkler (ed.), Kunst und Geschichte Nubiens in Christlicher Zeit, pp. 141-54. Aurel
Bongers: Recklinghausen. 

ADAMS, W. Y. [1975]: ‘Principles and Pragmatics of Pottery Classification: Some Lessons 
from Nubia.’ In J. S. Raymond, B. Loveseth, C. Arnold and G. Reardon (eds.), Primitive 
Art and Technology, pp. 81-91. Archeological Association of the University of Calgary: 
Calgary. 

ADAMS, W. Y. [1986a]: Ceramic Industries of Medieval Nubia. Memoirs of the UNESCO
Archeological Survey of Sudanese Nubia, Vol. 1. University Press of Kentucky: Lexington.

ADAMS, W. Y. [1986b]: ‘From Pottery to History: the Dating of Archeological Deposits 
by Ceramic Statistics’, Wissenschaftliche Zeitschrift der Humboldt-Universität zu Berlin,
Geisteswissenschaftliche Reihe, 35 Jahrgang, Heft 1, pp. 27-45. 

ADAMS, W. Y., n.d.: ‘Ceramics and Archeological Dating at Qasr Ibrim, Egypt.’ Paper 
read at 86th General Meeting of the Archeological Institute of America, Toronto, 29 
Dec. 1984. 

ADAMS, W. Y. and ADAMS, E. W.: Archeological Typology and Practical Reality. In prep- 
aration. 

ADAMS, W. Y., ALEXANDER, J. A. and ALLEN, R. [1983]: ‘Qasr Ibrim 1980 and 1982’, Journal 
of Egyptian Archeology, 69, pp. 43-60. 

ALDENDERFER, M. S. and BLASHFIELD, R. K. [1978]: ‘Cluster Analysis and Archeological 
Classification’, American Antiquity, 43, pp. 502-6. 

BREW, J. O. [1946]: The Archeology of Alkali Ridge, Southeastern Utah. Papers of the Peabody
Museum of American Archeology and Ethnology, Harvard University, Vol. 21. 

CAMPBELL, N. R. [1921]: What is Science? Dover edition, 1951. 

CLARKE, D. L. [1968]: Analytical Archeology. Methuen: London. 

COLTON, H. S. [1953]: Potsherds. Museum of Northern Arizona Bulletin 25.

COLTON, H. S. and HARGRAVE, L. L. [1937]: Handbook of Northern Arizona Pottery Wares.
Museum of Northern Arizona Bulletin 11. 

DORAN, J. E. and HODSON, F. R. [1975]: Mathematics and Computers in Archeology. Cambridge:
Harvard University Press.

DUNNELL, R. C. [1971]: Systematics in Prehistory. New York: The Free Press. 

FORD, W. E. [1947]: Dana’s Textbook of Mineralogy, 4th edition. New York, 1947.

HILL, J. N. and EVANS, R. K. [1972]: ‘A Model for Classification and Typology’. In David
L. Clarke, (ed.), Models in Archeology, pp. 231-273. London: Methuen. 

HODSON, F. R. [1982]: ‘Some Aspects of Archeological Classification.’ In R. Whallon and
J. A. Brown (eds.), Essays on Archeological Typology, pp. 21-9. Evanston, Illinois: Center 
for American Archeology Press. 

HOLSINGER, K. E. [1984]: ‘The Nature of Biological Species’, Philosophy of Science, 51, pp.
293-307. 

HULL, D. L. [1970]: ‘Contemporary Systematic Philosophies’, Annual Review of Ecology and 
Systematics, 1, pp. 19-53. 

JEVONS, W. S. [1877]: The Principles of Science; a Treatise on Logic and Scientific Method.
Dover edition, 1958. 

JOHNSON, W. E. [1921]: Logic, Part I. Cambridge: Cambridge University Press. Dover 
edition, 1964. 

KITCHER, P. [1984]: ‘Species’, Philosophy of Science, 51, pp. 308-33. 

LARSON, J. L. [1970]: Reason and Experience. Berkeley: University of California Press.

MACH, E. [1908]: Die Mechanik in Ihrer Entwicklung Historische-Kritisch Dargestellt, 6th ed.
English translation, The Science of Mechanics, translated by T. J. McCormack. La Salle: 
Open Court, 1915. 

MANDERS, K. L. [1977]: Necessary Conditions for Representability, Electronics Research Lab- 
oratory, University of California, Berkeley. 

MAYR, E. [1949]: ‘The Species Concept: Systematics versus Semantics’, Evolution, Vol. 4, pp.
371-2. 

PEACOCK, D. P. S. [1970]: ‘The Scientific Analysis of Ancient Ceramics’, World Archeology,
1, pp. 375-89.

PLUMLEY, J. M. and ADAMS, W. Y. [1974]: ‘Qasr Ibrim 1972’, Journal of Egyptian Archeology,
60, pp. 212-38. 


PLUMLEY, J. M., ADAMS, W. Y. and CROWFOOT, E. [1977]: ‘Qasr Ibrim 1976’, Journal of
Egyptian Archeology, 63, pp. 29-47. 

ROBERTS, F. E. [1979]: Measurement Theory. Reading: Addison-Wesley. 

SABLOFF, J. A. and SMITH, R. E. [1969]: ‘The Importance of Both Analytic and Taxonomic
Classification in the Type-Variety System’, American Antiquity, 34, pp. 278-85. 

SCHIFFER, M. B. [1976]: Behavioral Archeology. New York: Academic Press. 

SHEPARD, A. C. [1965]: Ceramics for the Archeologist, 5th Printing. Carnegie Institution of 
Washington, Publication 609. 

SNEATH, P. H. A. [1962]: ‘The Construction of Taxonomic Groups.’ In G. C. Ainsworth and
P. H. A. Sneath (eds.), Microbial Classification, pp. 289-322. Cambridge: Cambridge 
University Press. 

SPAULDING, A. C. [1982]: ‘Structure in Archaeological Data: Nominal Variables.’ In Robert 
Whallon and James A. Brown (eds.), Essays on Archaeological Typology, pp. 1-20. 
Evanston: Center for American Archaeology Press. 

STEVENS, S. S. [1946]: ‘On the Theory of Scales of Measurement’, Science, 103, pp. 667-80. 

SUPPES, P. [1957]: Introduction to Logic. Princeton: D. Van Nostrand. 

THOMAS, D. H. [1972]: ‘The Use and Abuse of Numerical Taxonomy in Archeology’,
Archeology and Physical Anthropology of Oceania, 7, pp. 31-49. 

VAN FRAASSEN, B. C. [1980]: The Scientific Image. Oxford: The Clarendon Press. 

WHALLON, R. [1972]: ‘A New Approach to Pottery Typology’, American Antiquity, 37, pp. 
13-34. 


APPENDIX A ware description in the Nubian pottery typology of W. Y. Adams 


The following description of the TERMINAL CHRISTIAN DECORATED WHITE WARE, which is 
complete except for a final page with drawings of typical forms and decorations of vessels of 
this type, is drawn from WYA [1986a]: pp. 512-13. 


FAMILY N Group N.VII 
Ware W14 TERMINAL CHRISTIAN DECORATED WHITE WARE

A rather heavy matte white ware decorated in Style N.VII: the most distinctive ware of the 
Terminal Christian Period. It is presumably evolved from Wares W15 and W16, but is 
distinguished from them by its bolder and simpler decorative style and by a distinctive, rather 
heavy group of vessel forms. 


CONSTRUCTION: Wheel-made. 


FABRIC: Paste: Nile mud. Density: medium. Texture: medium. Color: tan, light brown, or red- 
brown shading to darker, often purplish core (typical Munsell signatures 2.5YR 4/5, 2.5YR 
6/6). Carbon streak: occasional, seldom dark. Hardness: generally medium soft (Mohs’ values 
2.5 to 4.5, av. 3.0). Fracture: medium. Solid temper: fairly abundant fine sand, black and red 
fragments. Organic temper: none seen. Variability: apparently low. Remarks: same fabric as in 
Ware R28. 


SURFACES: Covering: medium thick, soft slip. Finish: matte or sometimes lightly polished. 
Texture: usually rather chalky or gritty. Configuration: level, rotation marks not prominent on 
interiors. Variability: surfaces may be matte or lightly polished; never glossy. 


FORMS: Principal forms: cups, plain bowls, vases (Fig. 284). Other forms: goblets, footed bowls,
lids, jars (Fig. 284). Forms not illustrated: A9, A20, A23, D44, D47, F27a, Q6. Doubtful forms:
C12, C34, C42, F6. Vessel sizes: mostly medium. Rims: rounded, frequently thickened. Bases: 
ring base in footed bowls. Wall thickness: generally notably thick, especially in larger vessels 
(7 to 13 mm av. 9.6). Execution: generally fairly precise. Variability: apparently low. 


COLORS: Natural color: tan, light brown or red-brown (typical Munsell signatures 2.5YR 4/5,
2.5YR 6/6). Slip: shades from cream to pale pink, yellow, tan, or orange (typical Munsell 
signatures 7.5YR 7/8, 5YR 6/8). Interior usually cream or white. Primary decoration: very 
dark brown to dense black (typical Munsell signatures 10R 3/1, 10R 3/2). Secondary decoration: 
Medium to dark red (typical Munsell signatures 2.5YR 4/6, 10R 4/6). Common as rim bands
or spacer bands in larger vessels. Rim stripe: usually broad red; occasionally narrow black.
Variability: high variability in slip color, but fairly uniform in any given vessel. 


PAINTED DECORATION: Frequency: usual. Principal style: N.VII. Other styles: II, V. Most
common elements: rim stripes, borders, friezes. Other elements: plain body stripes, radials.
Exterior program: most commonly a single broad frieze; less often a single narrow border; very
occasionally a frieze with a narrow border above it. Interior program: not common. Most often 
a simple radial design extending to the vessel rim, without a surrounding border or frieze. 
Execution: fairly precise. Delineation: bold. Variability: apparently low. 


RELIEF DECORATION: None. 


APPRAISAL: Material: not common, but numerous whole vessels collected from Diffinarti. 
Adequacy of description: probably incomplete. Variability: apparently low except in regard to
slip color. Temporal variation: not known. Geographical variation: probably none; all made at 
one place. Intergradations: possibly with R28 and with predecessors W15 and W16; possibly 
also with companion white Ware W18. Diagnostics: matte white ware decorated in Style N.VII;
peculiar group of vessel forms (shared with Ware R28). Problems: material insufficient; 
range and center of manufacture not known. 


SIGNIFICANCE: Earliest appearance: 1250 AD. Main period of manufacture: not determined. 
Continued use: to 1600 AD. Persistence of sherds: not determined. Archeological contexts: domestic
refuse. Area of distribution: identified from Qasr Ibrim through Batn el Hajar, wider
distribution not determined. Center of production: not determined; presumably made in the 
same place as R28. Frequency: not common. Relationships: presumably an outgrowth of Wares 
W15 and W16 in Group N.IV. Companion Ware R28 is a red-slipped counterpart. There is
no successor ware. Associations: see group description. Index clusters: LCa2 (2%), TC (2%). 


REFERENCES: Vessel photos: Adams [1970], pl. 64; Monneret de Villard [1957], IV, pl. CCI, 
nos. B, D; pl. CCIII; Schneider [1970], pl. 35; Van Moorsel et al. [1975], pl. 35; Villa Hugel 
[1963], Kat. 486. Sherd drawings: Monneret de Villard [1957], IV, pl. CXCII, no. 71; pl. 
CXCI, no. 94. 


University of California, Berkeley 
University of Kentucky 


Brit. J. Phil. Sci. 38 (1987), 441-480 Printed in Great Britain 441 


The Status of Popper’s Theory of 
Scientific Method 


by ROBERT NOLA 


1 Introduction
2 Scientific Method as a Second-Order Discipline
3 Popper’s Methodological Rules for Science
4 Popper’s Anti-Naturalistic View of Methodology
5 Popper’s Conventionalist Meta-Methodological Critique of Methodologies
6 Objective Progress in Theories of Method
7 An Empirical Basis for Popperian Methodology
8 Conclusions


1 INTRODUCTION


In his preface to the first English edition of The Logic of Scientific Discovery 
(LSD) Popper tells us: ‘There is no method peculiar to philosophy’ (LSD, 
15, italics in original).¹ But on the next page Popper concedes:


And yet I am quite ready to admit that there is a method which might be described 
as ‘the one method of philosophy’. But it is not characteristic of philosophy alone; 
it is, rather, the one method of all rational discussion, and therefore of the natural 
sciences as well as of philosophy. The method I have in mind is that of stating one’s 
problem clearly and of examining its various proposed solutions critically (LSD, 
16). 


We are not given a more precise characterisation of this method than the 
almost contentless directive to examine proposed solutions critically and 
rationally. However the italicised words suggest a name for this method, 
viz., ‘critical rationalism’ (‘CR’ for short); it remains to be seen whether
CR can be given a less bland characterisation as a method which governs 
the evaluation of theories in areas as disparate as science and philosophy. 
In other writings Popper is also at pains to deny that there is such a thing 
as scientific method.² But this turns out to be a misleading way of saying
that a number of views as to what this method is are ill-founded. If there
is method in both philosophy and science it is the method of CR.


Received November 1986


1 Popper [1959]. This work will be referred to as LSD in the text, a following number being
a page reference.

2 For example, see the 1956 Preface to Popper [1983] entitled ‘On the Non-Existence of
Scientific Method’, pp. 5-8. In this preface Popper sets out the senses in which he thinks
there is no scientific method. However after some brief comments on criticism as ‘arguing
with others’ Popper tells us that there is a sense in which he thinks there is such a thing as
scientific method: ‘I believe that the so-called method of science consists in this kind of criticism’
(p. 7).

Whether or not the LSD contains a theory of method for philosophy it 
does contain a theory of method for science. Right from the opening remarks 
of the first chapter Popper says that his task is ‘. . . to analyse the method 
of the empirical sciences’ (LSD, 27). More strongly, Popper claims that ‘... 
empirical science may be defined by means of its methodological rules’ 
(LSD, 54).¹ Despite some ambiguity in a few places (including the English
title of LSD), these methodological rules are intended to apply entirely 
within the context of justification and not the context of discovery. By 
chapter II of LSD it emerges that these rules are norms which are to guide 
the way we ought to treat statements if they are to be deemed scientific and 
that these rules are constitutive of the scientific enterprise. They not only 
demarcate the scientific from the non-scientific but they also tell us how 
we ought to treat those statements that fall into the box labelled ‘scientific’. 

What are these rules? Some will be listed in section 3. More importantly, 
what is the status of these rules? Of course, like all rules or norms they have 
prescriptive force. However, it turns out that Popper views these rules 
as part of a second-order critical tradition distinct from the first-order 
statements to which they apply. This will be discussed in section 2. It will 
also be seen that Popper rejects the view that these rules have an a priori 
status somewhat like the status that many (but not all) allege for the rules 
of logic. He also rejects the view that these rules have an empirical status 
(see section 4). In the LSD these rules are viewed as conventions—and 
this despite Popper’s rejection of conventionalism as a theory of scientific 
method (see section 5). However in post-LSD writings Popper veers in the 
opposite direction maintaining a view of methodology which gives these 
rules a quasi-empirical character (see section 7). It will be argued that even 
though Popper has raised important questions about meta-methodology 
none of his answers is satisfactory or sits happily with his methodological 
requirements for science. 


2 SCIENTIFIC METHOD AS A SECOND-ORDER DISCIPLINE 


Concerning the ancient Greeks, Popper says in his paper ‘Towards a
Rational Theory of Tradition’:²


My thesis is that what we call ‘science’ is differentiated from the older myths not 
by being something distinct from a myth, but by being accompanied by a second- 
order tradition—that of critically discussing the myth. Before, there was only the 
first-order tradition. A definite story was handed on. Now there was still, of course,
a story to be handed on, but with it went something like a silent accompanying text
of a second-order character: ‘I hand it on to you, but tell me what you think of it.
Think it over. Perhaps you can give us a different story.’ This second-order tradition
was the critical or argumentative attitude. It was, I believe, a new thing, and it is
still the fundamentally important thing about scientific tradition.


1 In LSD, 50 Popper proposes ‘. . . that empirical science should be characterized by its
methods, by our manner of dealing with scientific systems: by what we do with them and
what we do to them. Thus I shall try to establish the rules, or if you will the norms, by
which the scientist is guided when he is engaged in research or in discovery, in the sense
here understood’.

2 Popper [1963], p. 127.


Our first-order traditions will include myths and stories about a whole host 
of matters, e.g., dreams, the motion of bodies, the origin of life, etc. Many
of these will have a long history. For example, there is the sequence of 
myths or stories about dreams from Homer, Aristotle and the Bible to our 
modern myth-makers such as Freud and Jung and the ‘computer’ theories 
of Christopher Evans and Francis Crick;¹ there is the sequence of stories
about how bodies move from Aristotle, via Kepler, Galileo and Newton, 
to Einstein; and so on. 

What makes one of these first-order traditions scientific? Popper’s remark 
above creates the impression that it is not anything intrinsic to the story 
being told. A story-teller in a scientific tradition may make a single state- 
ment, or a number of statements which can be conjoined together and 
which exhibit enough ‘unity’ to be called a theory. The single statement, 
or the conjunction, may exhibit certain logico-epistemological properties 
such as verifiability, falsifiability, etc. But this would not make the single 
statement, or the conjunction, scientific. Only if the story is accompanied 
(intentionally or not) by the second-order tradition of critical discussion is 
it scientific; unaccompanied it is unscientific. 

This highlights an ambiguity in Popper’s concept of what it is to be 
scientific. On one view statements are said to be scientific if they possess 
some logico-epistemological property such as testability or non-zero degree 
of falsifiability. The last of these Popper endorses as the mark of the 
scientific character of a statement in a number of places.² But in the passage
just quoted, and in the remarks quoted in section 1, Popper emphasises 
another view. What makes a theory (myth, story) scientific is the critical 
method that accompanies, and is applied to, a theory and not just some 
logico-epistemological property the theory possesses (though these proper- 
ties will have to hold of the theory if the critical method is to be applied to 
it). 

1 Concerning the latter “computer” theories see Evans [1983] especially Part Three, and
Crick and Mitchison [1983].

2 Thus in Popper [1963], p. 37 there occurs the following italicised remark: ‘One can sum up
all this by saying that the criterion of the scientific status of a theory is its falsifiability, or
refutability, or testability.’ And in LSD, pp. 40-1 we find: ‘. . . the falsifiability of a system
is to be taken as a criterion of demarcation. In other words, I shall not require of a scientific
system that it shall be capable of being singled out, once and for all, in a positive sense; but
I shall require that its logical form shall be such that it can be singled out, by means of
empirical tests, in a negative sense: it must be possible for an empirical scientific system to be
refuted by experience.’


But if it is the second-order critical tradition accompanying a first-order
story that makes the first-order story scientific then we arrive at an odd
consequence: the same statement can be both scientific and unscientific. If
the story-teller making the statement is also a member of a second-order 
critical tradition then the statement is scientific; but if he is not a member of 
a second-order tradition then the statement is unscientific. Thus Putnam’s 
example: ‘if anyone puts a flour sack on their head and raps the table 99 
times then a demon will appear’¹ has the logico-epistemological property
of being strongly falsifiable and therefore, on the first view, is scientific 
(granting Popper’s demarcation criterion). However, on the second view, 
its scientific status depends on whether or not the person making the claim 
is, or is not, a member of a second-order critical tradition. What is scientific 
then becomes unacceptably relative to individuals, or social groups, and 
their membership of a critical tradition. 

This problem can be cleared up by distinguishing the different kinds of
item that can have the predicate ‘scientific’ applied to them, viz., people 
and their attitudes, statements or theories or stories and, finally, methods. 
The following definitions of ‘scientific’ as applied to each of these items are 
tentatively suggested. First, we can say that a person, or a community, has 
a scientific attitude if and only if they have a second-order critical tradition, 
i.e., they apply, intentionally or not, the canons of some scientific method 
to their first-order theories (stories). Second, we can say that a statement 
(theory, etc.) is scientific if and only if it has some logico-epistemological
property specified by some theory of method. This makes the scientific 
status of a statement (theory, etc.) relative to a theory of method but inde- 
pendent of whether or not anyone entertaining the statement has applied 
the theory of method to it. Both of these definitions make mention of 
scientific method. What makes a method scientific can in turn be filled out 
according to the various accounts that are currently proposed in the litera- 
ture (some of which will be mentioned shortly). 

When the second-order critical method is applied to the first-order 
tradition usually a definite historical sequence of stories is generated, e.g., 
in the case of motion there is the long sequence of distinct stories from 
Aristotle up to the latest story today. Likewise, the second-order tradition 
exhibits a historical sequence of stories about the critical practices we have, 
or ought to have, applied to the first-order stories, Aristotle’s Organon 
probably being the first story at the beginning of the historical sequence. 
The first-order tradition may be accompanied ‘silently’ (as Popper puts it) 
by some distinctive second-order critical method. But if a critical method 
is intentionally applied, or if a critical method is conjectured to have been 
applied, the second-order tradition comes alive as a sequence of distinct 
stories in its own right about the critical methods we apply, or are said to 
have applied, to the stories at the lower level. (The phrase ‘said to have 
applied’ is used deliberately because, as the case of Newton especially
shows, the officially-announced theory of method employed may not have
been the one actually employed as later historical research reveals.)


1 See Putnam [1981], p. 197.

In the paper ‘Towards a Rational Theory of Tradition’ Popper charac- 
teristically warns us against a certain kind of second-order story (which 
could be labelled ‘naive inductivism’, the accumulation of observations to 
which induction is applied to arrive at laws and theories) and recommends 
another second-order story (his own falsificationist methodology which he 
tends, on occasions, to identify with the second-order critical tradition 
itself!). Earlier, in LSD, Popper had identified three distinct theories of 
scientific method in our second-order critical tradition, viz., naive induc- 
tivism, conventionalist methodology, and falsificationist methodology. 
Lakatos has developed his own story at this level, viz., the methodology of 
scientific research programmes. Other methodologies are also currently 
advocated. These include: Bayesian theories of method, this being a genus 
of which there are several species; Laudan has proposed a methodology of 
scientific research traditions which emphasises the role of problem-solving; 
in his methodological moments Feyerabend has advocated a theory of 
method, methodological pluralism, which has some analyses of historical 
episodes to its credit; and so on. The proliferation of second-order critical 
methods which may be applied to first-order theories can be represented 
in the accompanying diagram as a distinct level of theorising. On top of 
this is added a third level—that of meta-methodology. 

Level I contains the historical sequences of scientific theories that have 
been proposed concerning each domain of phenomena. It is here that 
scientific progress is to be found either in the form of increasing veri- 
similitude, or greater predicative capacity, or whatever. 

Level II contains the various theories of critical method that have been 
proposed, i.e., the theories of scientific method or methodologies of science. 
In what sense are they at a distinct level? One uncontroversial sense is that 
in which theories of method are said to be about, or have as their subject 
matter, substantive scientific theories at level I. Another more controversial 
sense of ‘level’ is that theories of method are logically independent of 
empirical theories; that is, the methods we employ to choose between 
scientific theories at level I are not dependent in any way on those very 
level I theories themselves. Popper clearly supports the levels picture in 
this sense as well. But, in contrast, a number of philosophers¹ have argued
for the theory-dependence of theories of method, i.e., they allege that the
application of level II methods to level I theories depends (in some as yet 
unspecified way) on features of the very scientific theories they are to assess 
and that no level II theory of method is free from some input from empirical 
theory. 


1 The theme of the theory-dependence of method has been taken up in a number of ways, 
e.g., Feyerabend [1978], Part One, § 2; Shapere [1980], Boyd [1984] and Putnam [1981],
pp. 188-99. 


[Diagram: three levels of theorising.

LEVEL III. Meta-Methodology: modes whereby theories of method at level II can be critically
evaluated. A priori theories of meta-methodology: (1) Conventionalism (Popper of LSD); ...;
(3) Transcendentalism; etc. Empirical theories of meta-methodology: ...; (2) Falsificationist
meta-methodology; (3) Lakatos’ meta-methodology; etc.

LEVEL II. Second-Order Traditions of Critical Method, i.e., Theories of Scientific Method, i.e.,
Methodologies: (1) Aristotle’s Organon; ...; (3) Newton’s Rules; (4) Naive Inductivism;
(5) Bayesian Methods; (6) Falsificationism (Popper); (7) Conventionalism; (8) MSRP (Lakatos);
(9) Scientific Research Traditions (Laudan); (10) Methodological Pluralism (Feyerabend).

LEVEL I. First-Order Traditions: the historical sequences of theories in the development of a
particular science, e.g., theories of dreams (Homer, the Bible, Aristotle, ..., Freud, Jung, ...,
Evans, Crick, ...) and theories of motion (Aristotle, ..., Kepler, Galileo, Newton, ...,
Einstein, ...).]

Popper intends the levels picture in another sense as well. He (along with 
others such as Lakatos) is an advocate of the view that there is one and only 
one theory of method to be discovered and that once it has been discovered 
it is to apply universally to all theories at all times at level I. Popper does 
admit that there can be growth in our theories of method towards the one 
correct method (this is discussed in section 6). Thus there can be genuine 
discoveries in the theory of method (i.e., one might think here of recently 
developed methods for evaluating statistical hypotheses). Our theories of 
method, in so far as they incorporate features of the one correct method, 
will then have made correct assessments of theories at level I. He would 
not admit that there may be distinct theories of method applicable to 
different level I theories at different times. Nor would he admit that there 
can be distinct theories of method (within the context of justification) for 
distinct subject matters. Thus there are not distinct methods for the natural 
and the humanistic sciences; rather, there is unity of method across all 
sciences.¹

What of level III? Should we even admit the existence of meta-metho- 
dology at level III? Clearly if we admit that there are competing theories 
of method at level II we need some way of adjudicating between them. This 
adjudication may be carried out in two ways. The first, which we may call 
‘the criticism from below’, would be a logical or an epistemological critique 
of the shortcomings of a theory of method. This approach is adopted by 
Popper against naive inductivism. He uses Hume’s argument that induction 
cannot be justified to eliminate naive inductivism as a viable theory of
method; and he develops an epistemological argument to the effect that the 
observational base from which the inductions are allegedly made is not 
available to us.² Elsewhere Popper attempts logical critiques of Bayesianism
and of varieties of confirmation theory based on probability (see, for 
example, the controversy between Popper and adherents to the Carnapian 
approach to confirmation). It is not my purpose to evaluate these critiques 
of certain level II methodologies; they are mentioned as one way in which 
a number of rival methodologies could be eliminated if the arguments are 
successful. 

What if, after such criticism from below, we are still left with several 
rival methodologies? Here meta-methodology could come into its own—we 
can call it ‘the criticism from above’. Whatever level III meta-methodology 
might be like it would comprise one or more distinct rules or principles for 
adjudicating between rival level II methodologies (whether or not any 
criticism from below has been applied to them—in what follows let us 
assume that criticism from below has done its work but still leaves us with 
rival methodologies). However meta-methodology is not unproblematic. 


1 See Popper [1957], especially § 29 entitled ‘The Unity of Method’, and Popper [1976a], pp.
[illegible], especially Thesis six.
2 These two critiques of induction re-appear throughout Popper’s writings; in particular see
‘Science: Conjectures and Refutations’ in Popper [1963].


448 Robert Nola 


There are three possibilities concerning level III. (i) There are no meta- 
methodological theories. If this were the case we would be left with rival 
theories of scientific method at level II between which there is no method 
for deciding. This raises the possibility of relativism (or pluralism) with 
respect to theories of method.¹ (ii) There are two or more distinct rival
meta-methodological theories at level III. If there is proliferation of meta- 
methodologies then either relativism sets in at level III rather than level II 
or one can ascend to level IV at which one finds a single meta-meta- 
methodological theory to adjudicate between meta-methodologies at level 
III. If at some level a unique theory does not emerge then one is threatened 
with either relativism once more or an indefinite regress up the hierarchy of 
levels of methodological theory. (iii) There is a single meta-methodological 
theory at level III which enables us to decide between rival methodologies
at level II. Without arguing for any one of these possibilities let us proceed 
on the assumption that there is at least a legitimate field of meta-metho- 
dology to investigate. 

What a meta-methodological theory is like is, at the moment, unclear. In 
subsequent sections some candidate theories will be considered. Broadly 
they can be classified as either a priori or empirical. Popper’s conventionalist 
approach to meta-methodology in the LSD is the only kind of a priori 
theory that will be investigated in detail. Other a priori theories include 
‘logicism’, the view that rules of scientific method have the same status as 
rules of logic, or transcendentalism, the view that meta-methodology could 
be justified from the bare possibility of science. Empirical approaches 
to meta-methodology are various. Popper canvasses and rejects one such 
approach, viz., naturalism (see section 4). In section 7 another empirical 
approach will be canvassed based on falsificationist meta-methodology. 
However these two do not exhaust the range of empirical accounts of meta- 
methodology. Before these positions are elaborated a brief sketch of some 
of Popper’s level II methodological rules will be given in order to illustrate 
some features of a theory of method and to give a concrete example of 
aspects of at least one level II methodology. 


3 POPPER’S METHODOLOGICAL RULES FOR SCIENCE 


Each level II methodology, M, can be broadly described as having a goal 
G (or set of goals) and a set of rules R which, if followed, will, allegedly, 
reach the goal. We may define a methodology M as the pair <G, R>. 
Alternatively, we could treat the first member of the pair as an overriding 
rule which says, ‘Always aim for goal G.’ On this view methodologies can 
be treated as sets of rules rather than as the pair rules-plus-goal. Either 
view can be adopted in what follows. Popper lays down a number of goals 
for his theory of method not all of which are independent of one another. 


1 See Popper [1976b] for his rejection of relativism and “the myth of the framework”.




In various places Popper has suggested various goals for science such as:
falsifiable statements;¹ true explanatory theories;² true theories with a high
degree of explanatory power;³ theories with a high degree of verisimilitude⁴
(note that this goal is not consistent with the previous two goals since
verisimilitude is defined as a relation holding between false theories);
explanatory theories with a high degree of testability (falsifiability).⁵ Goals
Popper rejects for his theory of method include ‘absolutely certain irrevocably
true statements’⁶ and statements with a high degree of probability.⁷
Popper regards the choice of such goals as a ‘proposal for an agreement or
a convention’,⁸ or as a decision which depends ‘. . . upon the aim which we
choose from among a number of possible aims [for science]’.⁹ In claiming
this Popper rejects the view that he is providing ‘the true or essential aims
of science’.¹⁰ Rather he sees the choice of goals for science as a matter for
decision, the possibility being left open that different methodologies may 
contain radically different goals for science. Popper also lays down a number 
of rules which are alleged to achieve goal(s) G. However Popper provides 
no proof that these rules do achieve many of the goals listed (though it will 
be evident that many of the rules Popper suggests do lead to theories with 
a high degree of falsifiability since they are specifically designed with that 
goal in mind). Nor does Popper provide a proof that these rules are the 
only ones that achieve the goal(s) (i.e., two or more sets of rules could do 
the job equally well). 

What are these rules? The first is a rule of demarcation. Popper claims 
that a statement (theory, story, etc.) S is scientific (or empirical—Popper 
uses these terms interchangeably on occasions) if and only if it is testable. 
Testability is then identified with falsifiability but not verifiability or probability.¹¹
Falsifiability is a purely logico-epistemological property possessed
by any statement S for which a quantitative measure can be given in the
interval [0, 1]. Thus we may speak of the degree of falsifiability of any
statement S and we may write for short, ‘0 < D.Fsb.(S) < 1.’ While there
can be no objection to the notion of degree of falsifiability from a logical
point of view¹² (it turns out to be a measure of the amount of information


1 LSD, 49.

2 Op. cit., p. 61, footnote *.

3 Popper [1963], p. 229 or Popper [1972], p. 191 where Popper suggests that ‘. . . the aim of
science is to find satisfactory explanations . . .’.

4 Popper [1972], p. 57.

5 Op. cit. p. 193 and p. 356. 

6 LSD, 37.

7 LSD, Chapter X and Popper [1983], Part II, Chapter II. 


LSD, 38. 

11 For these claims see LSD, sections 4 and 6 of Chapter I.

12 I pass over the many logical problems that confront Popper’s content measures such as 
degree of falsifiability and empirical content. For an attempt to deal with some of the 
problems see Watkins [1984], chapter 5.1. 




contained in a statement) there is controversy about whether the domain 
of the scientific is to be identified with the set of statements S such that
0 < D.Fsb.(S) < 1. Popper is aware that this may be controversial so he
makes it clear that he is making a decision, putting forward a proposal or
adopting a convention.¹ This point is often overlooked by confusing the
purely logico-epistemological property of degree of falsifiability possessed 
by statements with Popper’s declaration that he is simply going to adopt a 
certain range of the degrees of falsifiability as a mark of the scientific. As 
he puts it: 

My criterion of demarcation will accordingly have to be regarded as a proposal for 
an agreement or convention. As to the suitability of any such convention opinions 
may differ; and a reasonable discussion of these questions is only possible between 
parties having some purpose in common. The choice of that purpose must, of 
course, be ultimately a matter of decision going beyond rational argument (LSD, 37). 


This quotation also raises the problem, to be discussed subsequently, of 
how proposals or conventions can be assessed. What it does emphasise is 
that the demarcation criterion suggested by Popper has the character of a 
declaration or a stipulation. Shortly following this remark Popper talks of
‘values’: ‘Thus I freely admit that in arriving at my proposals I have been
guided, in the last analysis, by value judgements and predilections’ (LSD, 
38). But talk of values here is not to be confused with any ethical norms; 
rather the values are methodological rules or norms for science that Popper 
has decided to adopt. Thus we arrive at the first of our methodological 
rules, viz., the demarcation criterion which serves to ‘define’ science in the 
manner just outlined: 


Rule of Demarcation: Admit into science only those statements S such that
0 < D.Fsb.(S) < 1; reject all else as non-scientific.
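
Purely as an illustration, the rule can be read as a membership test on a statement's degree of falsifiability. The sketch below assumes that a numerical value for D.Fsb.(S) is already given; the text does not say how such a value would be computed, and the function name `is_scientific` is invented for this example.

```python
def is_scientific(d_fsb: float) -> bool:
    """Rule of Demarcation (illustrative reading).

    Admit a statement S only if 0 < D.Fsb.(S) < 1.
    `d_fsb` is an assumed, externally supplied degree of falsifiability.
    """
    # The measure itself is only defined on the closed interval [0, 1].
    if not 0.0 <= d_fsb <= 1.0:
        raise ValueError("degree of falsifiability must lie in [0, 1]")
    # The rule admits only the open interval: both limits are excluded.
    return 0.0 < d_fsb < 1.0
```

On this reading a statement with degree of falsifiability 0 is rejected as non-scientific, and so is one with degree 1; only intermediate values are admitted.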


The Supreme Meta-Rule: Closely linked to this rule is Popper’s supreme 
rule which is, as he says, a ‘methodological supplement’ to the demarcation 
rule just given: 

Just as chess might be defined by the rules proper to it, so empirical science may 
be defined by means of its methodological rules. In establishing these rules we may 
proceed systematically. First a supreme rule is laid down which serves as a kind of
norm for deciding upon the remaining rules, and which is thus a rule of a higher
type. It is the rule which says that the other rules of scientific procedure must be
designed in such a way that they do not protect any statement in science against
falsification (LSD, 54).


This supreme rule ensures that the other rules of scientific method are 
constructed so as to ensure that the demarcation criterion can be applied. 
It also leads directly to more specific rules which protect the demarcation 
criterion. 


1 See especially sections 4, 6, 9-11 and 19-20 of LSD where the stipulative character of his
definition of science is made clear. 




In proposing his demarcation criterion Popper was aware of a number 
of objections that could be brought against it. Prominent amongst them 
were various ways of evading falsification thereby rendering a statement 
unfalsifiable (and therefore non-scientific by the demarcation criterion). 
These evasion tactics include introducing ad hoc auxiliary hypotheses, 
changing ad hoc a definition or refusing to acknowledge falsifying obser- 
vations or experiments. Concerning these objections to the demarcation 
criterion Popper says: 

I must admit the justice of this criticism; but I need not therefore withdraw my
proposal to adopt falsifiability as a criterion of demarcation. For I am going to
propose (in sections 20f.) that the empirical method shall be characterized as a
method that excludes precisely those ways of evading falsification which, as my
imaginary critic rightly insists, are logically possible (LSD, 42).


When one looks at section 20 of LSD one finds an account of a rival 
theory of method with goals and rules different from Popper’s. This is a 
level II methodology called ‘conventionalism’ by Popper. Let us call this 
‘methodological conventionalism’ (or ‘m-conventionalism’ for short) in 
order to distinguish it from other things Popper calls ‘conventionalist’. 
Popper is definitely not a supporter of level II m-conventionalism. But, as 
has been seen, he does talk of his demarcation criterion as a convention and 
he does talk of his methodological rules as conventions (LSD, section 11). 
The former kind of conventionalism pertains to the conventional definition 
of the term ‘scientific’ by means of the demarcation criterion (call this ‘d- 
conventionalism’ which is short for ‘definitional conventionalism’). The 
latter treats rules as conventional in the sense that they are neither empirical 
nor analytic claims (call this ‘rule conventionalism’ or ‘r-conventionalism’ 
for short). However, as will be seen, there are strong elements of m- 
conventionalism at level III in Popper’s meta-methodology. 

While Popper marshalls some ‘criticisms from below’ against a naive 
inductivist level II methodology it turns out that he has no such criticisms 
to offer against m-conventionalism: 

I regard conventionalism as a system which is self-contained and defensible.
Attempts to detect inconsistencies in it are not likely to succeed. Yet in spite of all
this I find it quite unacceptable. Underlying it is an idea of science, of its aims and
purposes, which are entirely different from mine (LSD, 80).


In the light of this Popper simply bans m-conventionalism by incorporating 
in his methodology rules that are contrary to the rules of a level II con- 
ventionalist methodology: 

The only way to avoid conventionalism is by taking a decision: the decision not to
apply its methods. We decide that if our system is threatened we will never save it
by any kind of conventionalist stratagem (LSD, 82).


But surely Popper is employing a conventionalist stratagem in level III 
meta-methodology to get rid of the undesirable level II m-conventionalism. 
In order to preserve his own criterion of demarcation against criticism 


452 Robert Nola 


Popper is obliged to supplement it by methodological rules which simply 
prohibit the use of any methodological rule which would allow criticism to 
impinge directly on the demarcation criterion. This is just one respect 
(more will emerge shortly) in which the Popper of LSD is a conventionalist 
in level III meta-methodology while being an anti-conventionalist in level 
II methodology. 

We can get some idea of the kind of m-conventionalism Popper opposes 
by negating the following three rules which prohibit conventionalist strate- 
gies; whether anyone has adopted such rules as part of m-conventionalism 
need not concern us. The anti-conventionalist rules Popper adopts as part 
of his own level II methodology are: 


Anti-Conventionalist Rule 1, or The ‘No Ad-Hoc Hypotheses’ Rule: If some
theory T conflicts with what we observe and we modify T to T′ by adding
auxiliary hypotheses (and perhaps deleting some component hypotheses of
T) then we are to accept provisionally T′ only if D.Fsb.(T′) > D.Fsb.(T).¹
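
As a rough sketch only, Rule 1 amounts to a strict comparison between two degrees of falsifiability. The example assumes that numerical values for D.Fsb.(T) and D.Fsb.(T′) are somehow available, which the text does not establish; the function name `may_accept_modified_theory` is hypothetical.

```python
def may_accept_modified_theory(d_fsb_original: float,
                               d_fsb_modified: float) -> bool:
    """Anti-Conventionalist Rule 1 (illustrative reading).

    The modified theory T' may be provisionally accepted only if
    D.Fsb.(T') > D.Fsb.(T). Both arguments are assumed inputs.
    """
    for value in (d_fsb_original, d_fsb_modified):
        if not 0.0 <= value <= 1.0:
            raise ValueError("degree of falsifiability must lie in [0, 1]")
    # Strict inequality: a modification that leaves the degree of
    # falsifiability unchanged does not satisfy the rule as stated.
    return d_fsb_modified > d_fsb_original
```

Note that the comparison is strict, so a modification that merely preserves the degree of falsifiability fails the test just as one that lowers it does.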


Anti-Conventionalist Rule 2: This rule concerns the role of definitions in 
science. Against changes in explicit definitions Popper says that these 
‘... are permissible if useful; but they must be regarded as modifications 
of the system, which thereafter has to be re-examined as if it were new’ 
(LSD, 83). Though Popper does not say it, presumably changes in explicit 
definition will be admitted only if the “new system” has a higher degree of 
falsifiability than the old. Concerning implicit definitions we find an overall 
ban on treating theories as if they were implicit definitions of the theoretical 
terms they contain; in such cases the theories would have zero degree of 
falsifiability. Of this Popper says: ‘Yet this use [of undefined concepts in 
implicit definitions] must inevitably destroy the empirical character of the 
system. This difficulty, I believe, can only be overcome by means of a 
methodological decision. I shall, accordingly, adopt a rule not to use unde- 
fined concepts as if they were implicitly defined’ (LSD, 74-5). In a stroke 
Popper bans a whole range of interpretations of scientific theories not in 
conformity with his demarcation criterion.²


1 This is a rendering of Popper’s rule expressed thus: ‘As regards auxiliary hypotheses we
propose to lay down the rule that only those are acceptable whose introduction does not
diminish the degree of falsifiability or testability of the system in question, but, on the
contrary, increases it’ (LSD, 82-3). Whether this is an adequate expression of the rule
against ad hoc hypotheses need not concern us here. It has in fact been subject to a devastating
criticism in Grünbaum [1976]; Grünbaum shows that an anti-ad hoc rule of the sort suggested
by Popper is logically impossible. This is one kind of criticism from below that can be
directed against one of Popper’s central rules of method.

2 One striking consequence of this rule is that it bans the view, championed by Feyerabend,
Kuhn and others, that theoretical terms, such as ‘mass’, are defined by the theoretical context 
in which they occur, for example, that there is Newtonian mass and relativistic mass and 
that these are, in some sense, incommensurable with one another. Popper’s directive would 
take us in the direction that terms like ‘mass’ are transtheoretical and preserve their reference 
from theory to theory. However issues concerning meaning are underdiscussed by Popper— 
in fact they are generally dismissed as trivial. 




Anti-Conventionalist Rule 3: So far conventionalist strategies directed at 
saving theories which conflict with observations have been banned by 
Popper. But conventionalists sometimes attack those very observations and 
experiments that come into conflict with firmly held theories rather than 
save the theories. Popper alleges that conventionalists often adopt a sceptical 
attitude to such observations and experiments or reject them outright. To 
prohibit these, and other, saving stratagems at the observational level 
Popper urges the adoption of the rule: ‘Intersubjectively testable experi- 
ments are either to be accepted, or to be rejected in the light of counter- 
experiments’ (LSD, 84). 

In section 11 of the LSD entitled ‘Methodological Rules as Conventions’ 
Popper gives us some more rules of his methodology. One of these urges 
us to keep testing our theories and not lapse into dogmatism with respect 
to our current theories. 


No-Stopping Rule: ‘The game of science is, in principle, without end. He
who decides one day that scientific statements do not call for any further 
test, and that they can be regarded as finally verified, retires from the game’ 
(LSD, 53). 


Though this does not have the form of a rule it is easy to transform it into 
one that bids us never to give up testing our theories. 

Curiously enough Popper says of this rule: ‘Theories which we decide
not to submit to any further test would no longer be falsifiable’ (LSD, 54). 
But just because we decide not to test a theory it does not thereby lose 
its logico-epistemological property of having some degree of falsifiability. 
Rather it is we as testers of theories who abandon the second-order critical 
tradition with respect to our theories. Popper’s remark reveals how closely 
he weds his conception of science to the application of methodological rules 
to level I theories; if we fail to apply these rules we no longer play the game 
of science and the level I theories cease to be scientific, i.e., falsifiable. If
we fail to separate the methodological use of the notion of falsifiability 
from its logico-epistemological definition this remark fails to make good 
sense. 


Rules for Acceptance and Rejection of Hypotheses. We need some rules for 
accepting and rejecting hypotheses. What Popper offers in section 11 of 
LSD is sketchy: ‘Once a hypothesis has been proposed and tested, and has
proved its mettle [i.e., has been corroborated], it may not be allowed to
drop out without “good reason”. A “good reason” may be, for instance:
replacement of the hypothesis by another which is better testable [i.e., has
a higher degree of falsification]; or the falsification of one of the consequences
of the hypothesis’ (LSD, 53-4). Elsewhere Popper provides a
more full account of the rules, and grounds, for accepting and rejecting 
hypotheses (particularly probabilistic hypotheses) and attempts to meet 
objections to his proposals. It is not germane to my purpose to enter into
this discussion;¹ what is relevant is the methodological character of the rules
Popper proposes. 

Methodological rules can be generated by the manoeuvre of taking level 
I statements which are unfalsifiable, and therefore unscientific, and letting 
them re-enter science as level II methodological rules. Popper treats the 
principle of causality in this way (see LSD, 55-6 and 61). Following Popper 
we can take this principle to be the claim that for every event there is a 
causal explanation. This turns out to be unfalsifiable but a respectable piece 
of metaphysics; so it is not part of science. However it gets back into science 
as a methodological rule which Popper formulates thus: ‘It is the simple 
rule that we are not to abandon the search for universal laws and for a 
coherent theoretical system, nor ever give up our attempts to explain caus- 
ally any kind of event we can describe’ (LSD, 61). Thus methodology can 
take on the burden of metaphysical elements ejected from level I theory. 
In this way phenomenalist interpretations of science are given short shrift 
by methodological proscription: ‘For the requirement of scientific objec- 
tivity can also be interpreted as a methodological rule: the rule that only 
such statements may be introduced in science as are inter-subjectively 
testable’ (LSD, 56). 

More rules can be found elsewhere in Popper’s writings.² Those listed
above should suffice to illustrate Popper’s own account of method in par- 
ticular and, in general, his conception of the form any theory of method 
should take. Whether all the rules listed are a proper part of scientific 
method (e.g., those listed in the previous paragraph) or whether the rules 
have been adequately formulated are matters which lie outside the scope of 
this paper. 

However we can ask: are the rules outlined above suitable to serve as the 
methodology of CR which, as was pointed out in section 1, is alleged by 


1 Consider, for example, the discussion over Popper’s rejection of induction and his arguments 
in support of non-inductive corroboration. However even the remarks concerning falsi- 
fication need modification. Contrary to what is said above, a hypothesis should not be allowed 
to drop out simply because one of its consequences is false. As Feyerabend, Lakatos and 
others such as Giere [1985], p. 335, have pointed out, on the basis of this formulation of the 
rule for falsification most of our current theories would ‘drop out’ leaving us with no theory
at all since they each have false consequences. However Popper gives a more full account of 
falsification when he says: ‘If accepted basic statements contradict a theory, then we take 
them as providing sufficient grounds for its falsification only if they corroborate a falsifying 
hypothesis at the same time’ (LSD, 87). False consequences are necessary for the falsification 
of a theory but are not sufficient. This meets only some of the objections of Feyerabend, 
Lakatos and Giere. Subsequently in his ‘Reply to Lakatos’ in Schilpp (ed.) [1974], p. 1009 
Popper distinguished between eliminating a theory as a contender for the truth, i.e., refuting 
it, and abandoning a theory. A theory may well be refuted but not abandoned in the sense 
that even a person who believed a theory to be refuted may continue to work on it; refuted 
theories may reveal hidden true consequences. These two additional points of Popper’s, and 
others that could be made, show that even the rule for falsification is a much more complicated 
matter than is suggested in the text. 

2 See, for example, Popper’s attempts to lay down methodological rules concerning basic
statements in LSD sections 22, 23 and 28-30 and the various attempts to formulate rules 
for dealing with statistical hypotheses. 




Popper to be the method of both philosophy and science? The answer 
would seem to be ‘no’. In a number of places Popper is at pains to point 
out that most, if not all, philosophical statements have zero degree of 
falsifiability.¹ But the rules just listed depend crucially on the fact that the
statements to which they apply must have a degree of falsifiability strictly
in the open interval (0, 1) which, in particular, excludes the limit 0. So, the
methodology of CR cannot be identical to the rules of method above, in
particular, the central demarcation rule and its closely associated anti-
conventionalist rules. One way of preserving these rules as a theory of CR
would be to provide a measure of the information contained in philosophical
statements other than that of zero.² However this would be a radical
departure from Popper’s general point of view in which the information 
contained in such statements is identified with their degree of falsifiability. 
In addition this would entail reformulating the demarcation criterion 
in terms of a generalised notion of information content instead of falsifia- 
bility; thus Popper’s rules could not escape reformulation even under this 
proposal. 

More often than not CR is given a quite bland, almost contentless characterisation
by Popper in contrast to the richer rules of his theory of scientific
method. Thus CR may be no more than the general directive to criticise
our theories,³ or always to attempt to show that our theories are false. As
Popper puts it in the form of a rhetorical question: ‘Is it possible to examine
irrefutable philosophical theories critically? If so, what can a critical discussion
of a theory consist of, if not of attempts to refute the theory?’⁴ CR
then becomes an unobjectionable but rather empty set of rules which at
best excludes dogmatism but says little more that is positive. If the rules
are made more rich (like the rules listed above) then they can apply only to
specific kinds of statement (e.g., those with degree of falsifiability ≠ 0). It


1 It has been a cornerstone of Popper’s doctrines since LSD that metaphysical (and therefore
philosophical) statements, by definition, have zero degree of falsifiability. See also Part 2 of 
‘On the Status of Science and Metaphysics’ in Popper [1963] on the status of philosophical 
claims. On pp. 194-5 Popper lists a number of philosophical theories which he alleges are 
irrefutable (i.e., unfalsifiable) but are, at the same time, false and yet can be subject to
criticism. If they are criticizable then the canons of that criticism cannot be the metho- 
dological rules listed above. 

2 Some philosophers have attempted to provide a measure of the contentfulness of analytic
statements; see Hintikka [1973], chapters 6, 7 and 10.

3 No more than this is said in the passage quoted at the beginning of section 1; see the passage 
from LSD, 16. 

4 This is from Part 2 of ‘On the Status of Science and Metaphysics’, p. 198 of Popper [1963].
On p. 197 of this paper Popper lists three types of theory which can be subject to criticism.
These are logical/mathematical, empirical/scientific and philosophical/metaphysical. At best
the rules of method outlined in this section can apply only to the second kind of theory and
not to the other two kinds. Popper insists that we endeavour to distinguish true from false
claims in each kind of theory by subjecting them to criticism, i.e., attempts at refutation.
However one needs more than the directive ‘attempt to refute’ for a substantive account of 
CR. When CR is made more substantive, as in the case of the rules described in this section, 
then CR may well cease to apply to all three kinds of theory. 




seems that we must abandon the claim that there is any significant “unity
of method” in areas as disparate as philosophy and science beyond the
blandest of methodological directives such as ‘attempt refutations!’. From 
here on Popper’s rules will be treated narrowly as his rules of scientific 
method alone. For the sake of a convenient label they will be referred to 
collectively as ‘methodological falsificationism’, or ‘m-falsificationism’ for 
short. 


4 POPPER’S ANTI-NATURALIST VIEW OF METHODOLOGY 


To the positivist critics of methodology who said that a theory of method 
which was not a part of logic nor an empirical science must be nonsense, 
the Popper of LSD¹ replied that there was a third alternative, viz., methodologies
are sets of rules and these rules are conventions which we can
decide to adopt or not adopt. A thorough-going investigation of what status 
theories of method may have is outside the scope of this paper. As suggested 
in section 2, they may be classified broadly as either a priori or empirical. 
As will be seen, Popper rejects one a priori interpretation of methodologies; 
they do not have the same status as rules of logic (assuming that these are 
a priori). But he opts, in the LSD, for another a priori interpretation, viz., 
methodologies are conventions. (This is the topic of the next section.) 
Popper also rejects the view that methodologies can be empirical. By this 
Popper means that methodologies are not like the empirical sciences; this 
view he labels ‘naturalistic’ which he defines as ‘a study of the actual 
behaviour of scientists or of the actual procedure of science’ (LSD, 52). 
This is, presumably, an historical, psychological and sociological study of 
scientists and science both past and present. This is something in which, 
as Popper admits, a methodologist may well take an interest, but not qua 
methodologist. Popper’s arguments against naturalism are the topic of this 
section. However it remains to be seen whether or not there is yet another 
sense in which methodologies might be empirical yet not naturalistic in the 
sense defined here; this anti-naturalistic yet empirical interpretation of 
Popper’s methodology is the topic of section 7. 

Obviously, if we treat Popper’s methodological rules as prescriptions, 
and this is clearly the case from the way they were formulated in the 
previous section, then they cannot be empirical claims. Nor can these rules 
be deduced from, nor do they entail, any empirical claims, for good Humean 
reasons.² Thus there is a logical gulf between the prescriptive character of
methodological rules and any historical/psychological/sociological study of 
the behaviour of scientists and the procedures of science. How, then, is it 
possible to treat methodologies naturalistically as an empirical science in 
the first place? For each rule of method, R, it is possible to construct a 


1 For this charge see LSD, 51, footnote *1.
2 That Hume’s is/ought distinction applies in methodology of science as well as ethics is
vigorously defended in Musgrave [1983]. 




counterpart descriptive claim, D, which, we can assume, describes an ideal 
world in which the behaviour of all scientists is in conformity with rule R. 
Thus consider the demarcation rule of the previous section. We can con- 
struct a counterpart descriptive claim which in an ideal world would be true 
of all scientists since they were good m-falsificationists: all scientists admit 
into science only those statements with degree of falsifiability other than 0
or 1 and they reject the rest. Is our world the ideal one or not? Only the 
historical/psychological/sociological study can tell. 

Popper offers two considerations against naturalism in the two following 
sentences: 


I do not believe that it is possible to decide, by using the methods of an empirical 
science, such controversial questions as whether science actually uses a principle of 
induction or not. And my doubts increase when I remember that what is to be 
called a ‘science’ and who is to be called a ‘scientist’ must always remain a matter 
of convention or decision (LSD, 52). 


Concerning the first consideration Popper gives no argument for why he 
thinks that it is not possible to decide methodological matters empirically. 
However an important argument does lie behind Popper’s claim that can 
be set out explicitly. Let R (= $R_1, R_2, \ldots, R_n$) be some set of rules of method
and let D be the corresponding set of counterpart descriptive claims which 
are empirical in character and therefore comprise the naturalistic science 
of method. Let us test D by data culled from a historical/psychological/ 
sociological investigation of past and present science and scientists. In 
order to test D there must be some criteria which determine what data 
are or are not relevant to the test of D and how the relevant data bear on 
D. These criteria will be given by some subset (proper or not) of the set 
of rules R. Choose some $R_i$ from this set. Consider now its descriptive
counterpart $D_i$. The test of $D_i$ presupposes the acceptability of $R_i$ directly;
or indirectly in the sense that the test of $D_i$ may involve rules other than $R_i$
but these rules in turn have descriptive counterparts which are tested using
rule $R_i$; otherwise rule $R_i$ plays no genuine role in the methodology. Thus
there is at least one rule, viz., $R_i$, that cannot be established empirically.¹
This seems to be the argument which backs Popper’s claim that it is 
impossible to establish rules of method empirically. It is a strong argument, 
as strong as some of the classical objections to the possibility of justifying 
induction on the grounds that the justification presupposes induction.² In
fact this is how Popper expresses his objection to a naturalistic methodology
above; whether science uses a rule of induction cannot be determined 
empirically because that very empirical determination must be established 
by the rule of induction itself. Similarly, we could argue against a natur- 


¹ An argument similar to this is given in Giere [1985], p. 333 and is called by him ‘the circle
argument’. 

² It is precisely this kind of objection to induction which Popper endorses in the first section
of LSD. 




alistic view of the rules of ordinary reason that are formulated in the 
sentential and first-order calculi. How people do reason has no bearing on 
the validity of these rules; but, in so far as we attempt to establish regularities 
about how humans do in fact reason, some of these rules (e.g., Modus 
Tollens) may well play a crucial role in determining what are the regularities 
concerning human reasoning behaviour (even if, for example, one of the 
regularities is that they break the rule Modus Tollens). 

Popper’s second consideration against a naturalistic methodology turns 
on who is to count as a scientist and what is to count as science. Consider 
the descriptive counterpart to the rule of demarcation. It makes reference 
to all scientists. But who are they? We need criteria to distinguish scientists 
from a host of people from pundits to prophets. If we pick scientists out as 
those who at least deal only with statements with a degree of falsifiability 
between 0 and 1, then the descriptive counterpart to the rule of demarcation
becomes analytically true and is not empirical. For the descriptive counter- 
part to be genuinely empirical, criteria of individuation for scientists must 
be specified which are independent of any theory of method. The same 
goes for criteria for individuating what is to count as science in a genuinely 
naturalistic account of method. Popper’s own view is that what is to count 
as science and who are to count as scientists are to be specified in terms of 
his theory of method which, in turn, is to be given a conventional definition. 
Thus Popper’s second consideration against a naturalistic methodology 
turns on his view of the status of theories of method. As such his second 
consideration against naturalism does not have the same force as his first 
consideration. But it does raise the significant question of how science and 
scientists are to be individuated and replies that this can only be done with 
reference to (aspects of) a theory of method. 

While these considerations provide a case against a naturalistic account 
of methodologies they only do so in the narrow sense of naturalism so 
defined. There may yet be considerations that bear on methodologies which 
are of an empirical character yet are not naturalistic. Popper’s anti-natur- 
alistic argument leaves methodologies untouched by the actual historical 
development of science and, in particular, scientists’ actual appraisals of 
their own theories. However if methodologies are proposed in a manner 
not entirely divorced from actual science then to that extent they could be 
said to be empirically based. One way in which Popper attempts to give his 
theory of method an empirical flavour will be discussed in section 7. We 
will now turn to Popper’s highly a prioristic account of method in LSD in
which methodology is untrammelled by actual science. 


5 POPPER’S CONVENTIONALIST META-METHODOLOGICAL 
CRITIQUE OF METHODOLOGIES 


In escaping from the positivist trichotomy of analytic/empirical/nonsense 
Popper opted for the view that methodologies were conventions. Thus they 
are meaningful and a priori in character (in the sense that they are accepted 




on grounds other than those based on experience) rather than empirical; 
but they are not analytic—in particular, the rules of method do not have 
the same status as rules of logic. This Popper claims in the following 
passage: 


Methodological rules are here regarded as conventions. They might be described as 
the rules of the game of empirical science. They differ from the rules of pure logic 
rather as do the rules of chess, which few would regard as part of pure logic: seeing 
that the rules of pure logic govern transformations of linguistic formulae, the result 
of an inquiry into the rules of chess could perhaps be entitled ‘The Logic of Chess’,
but hardly ‘Logic’ pure and simple. (Similarly, the result of an inquiry into the
rules of the game of science—that is, of scientific discovery—may be entitled ‘The
Logic of Scientific Discovery’) (LSD, 53).


One can add further reasons why the methodological rules of section 3 are 
hardly like the rules of pure logic, e.g., disjunctive syllogism. They are not 
schematic and do not have places for variables; they contain descriptive 
terms as well as logical constants; if they are to be validated then this cannot 
be carried out in the usual model-theoretic fashion of, e.g., quantification 
theory; and so on. Nor are the rules warranted by virtue of the meaning 
of their constitutive terms; so they are not analytic. Since such rules 
are significant the Popper of LSD is left with the option that they are 
conventions. 

The passage just quoted is significant in a number of respects not least 
of which is the (somewhat misleading) choice of the title of the book to 
stand for an inquiry into the rules of scientific methodology (understood in 
the sense of the justification or the test of theories rather than their invention 
or discovery, a distinction which Popper is at pains to make in LSD (sections 
1 and 2) in his criticism of psychologism in science). In what sense are 
methodological rules conventions? First, they lack a truth-value. In reject- 
ing the positivist trichotomy Popper wanted to leave room for philosophical 
or metaphysical claims which were neither analytic nor empirical nor non- 
sense yet had a truth-value and were amenable to critical investigation. 
Conventions differ from such claims only in lacking a truth-value. If we 
wish to attribute truth-values to methodological rules or to claim that, in 
some sense, these rules can be validated or invalidated then one had better 
give up the view that they are conventions. This alternative will be taken 
up in the next two sections. 

Second, Popper emphasises the role of decisions in setting up conven- 
tions. But these decisions are not those which result from deliberations or 
reasoning. Rather there is a proposal which we must decide either to adopt 
or not adopt unconstrained by considerations which in any case, if given 
credence, underdetermine the final outcome. If there were a truth to be 
discovered about which rule to adopt then that would guide us in our 
considerations. But in the nature of the case the conventions adopted lack 
a truth-value. Such a “decisionist” view of methodological rules marks a
quite significant way in which they differ from rules of logic, despite the






comparison suggested in the quotation above. Rules of logic are clearly not 
conventions in that their acceptance is not based on a decision; nor are they 
‘proposals for an agreement’. They have truth-values and can be shown to 
be valid or invalid. 

Third, the comparison with the rules of chess suggests another feature 
of conventionality—there are equally viable alternative decisions that we 
could have made but did not. Chess is governed by a closely circumscribed 
set of rules which admit of no, or only slight, variation. Yet we can envisage 
alternative rules for moving pieces on a board and thereby play alternative 
games just as we can envisage alternative rules for our driving habits or 
rules associated with traffic light colours. (Consider the, probably apocry- 
phal, story in which rugby resulted from some soccer players deciding to 
run with the ball.) Given that there are alternatives, the current rules become 
constitutive of chess; adopt different rules and you are not playing chess. 
The rules of chess (call these ‘C’) provide a definition by convention for 
the word ‘chess’. Thus the claim ‘chess is played according to C’ is ana- 
lytically true. However the rules C are neither true nor false in any sig- 
nificant sense; they are merely the rules we choose to set up and follow 
when we play what we call ‘chess’. If we push the comparison between 
rules of chess and methodological rules this far then the latter, too, must 
admit of equally viable alternatives which we could have adopted and for 
which the question about their truth-value simply does not arise. 

Some of the above points of comparison hold, others do not, for metho- 
dological rules. Popper admits that there are alternative rules to his three 
anti-conventionalist rules of m-falsificationism that could be adopted. How- 
ever the decision to adopt either is not entirely an unconsidered matter as 
in the case of setting up the rules of a game like chess. A conventionalist 
has a goal for science, G(conv.), which is distinct from Popper’s, G(fals.), 
and each has distinct rules, R(conv.) and R(fals.), which are alleged to be 
instrumental in achieving their respective goals. The choice of each goal 
may well be ‘. . . ultimately a matter of decision, going beyond rational 
argument’ (LSD, 37). But this conventional aspect of a methodology infects 
only the goal chosen and not the rules which are instrumental in achieving 
the goal. If it turns out that for some goal G there are rival sets of rules 
$R_1, R_2, \ldots, R_n$ for achieving G, then G underdetermines the rules and further
decisions are required to pick some set of rules $R_i$. Only then is the
methodology ⟨G, $R_i$⟩ riddled with seemingly arbitrary decisions. Thus,
depending on the number of unconstrained decisions we have to make, 
methodologies may well vary in the number of their conventional elements. 

In admitting that there can be alternative methodologies, such as
⟨G(conv.), R(conv.)⟩ and ⟨G(fals.), R(fals.)⟩, it follows that the game of
science, unlike the game of chess, can be run according to alternative rules. 
The methodological rules are not constitutive of science in the same way 
as the rules of chess are constitutive of chess. If there are genuine metho- 
dological rivals then we need some way of individuating science in order to 




claim that these are rival rules for the game of science. This more relaxed 
view which permits methodological pluralism is hinted at in the LSD but 
is not permitted to flower. The rivals to m-falsificationism are ruled out by 
making further decisions. In ruling them out Popper proposes that the term 
‘science’ be used in a special way; he stipulates that the term be used for 
whatever is in accordance with the rules of m-falsificationism. Here we 
must distinguish kinds of convention. In proposing a definition for a term 
such as ‘science’ Popper is giving a definition by convention; call this a ‘d- 
convention’ for short. This is a linguistic rule embodying a decision to use 
a word in a certain way. The methodological rules of section 3 are alleged 
to be conventions but they are clearly not linguistic rules. Call these ‘rule- 
conventions,’ or ‘r-conventions’ for short, where it is understood that these 
are non-linguistic rules that we have decided to adopt. Popper’s d-con- 
vention for the term ‘science’ is simply another ‘proposal for an agreement’ 
(LSD, 37)¹ which in turn depends on the adoption of certain r-conventions.
In this way the rules of a particular methodology become constitutive of 
science and the analogy with chess holds—but at the cost of explicitly 
barring any alternative set of rules whereby the game of science might be 
played.

Such a view of the definition of ‘science’ (and other terms as well) under- 
lines Popper’s anti-essentialist account of definition. The term ‘science’ 
receives its definition not as a result of any investigation into science. 
Such an investigation would be either empirical and therefore unacceptably 
naturalistic or would be too a priori if it were to claim that the essential 
nature of science had been uncovered. Rather a definition of science is 
proposed in the spirit of a non-empirical conjecture in the hope that a 
general agreement can be reached about its adoption. 

Is Popper’s theory of method no more than a “castle in the air” with no
secure foundation in anything except quite insubstantial decisions, pro- 
posals and predilections? Not necessarily; one so far unexplored path would 
be to investigate the extent to which Popper’s conventionalist decisions are 
amenable to investigation in terms of decision theory. However in the 
absence of such an investigation it may seem that Popper’s m-falsi- 
ficationism has been concocted out of a number of r- and d-conventions 
which, since we could make decisions to adopt them or not as the case may 
be, are arbitrary and even irrationally based.² Such a view is an incomplete
picture of Popper’s position. At best, perhaps only the goal, G(fals.), is up 
for decision since the rules, R(fals.), are the only, or perhaps the best, way 


¹ Popper’s second consideration against a naturalistic theory of method is simply a reiteration
of his view that the definition of ‘science’ and ‘scientist’ is a matter of convention or decision. 
This also extends to his definition of ‘philosophy’; ‘A definition of the word “philosophy” 
can only have the character of a convention, of an agreement’ (LSD, 19). 

² See Stroker [1984] who considers the conflict between Popper’s espousal of an overall
critical rationalism with the seemingly irrational founding of critical rationalism upon mere
decisions. 




to achieve the goal; at worst the goal and the rules are each up for separate 
decisions. In any case Popper suggests ways of assessing his proposed r- 
and d-conventions, i.e., there is a level III meta-methodology for assessing
rival theories of method understood as systems of conventional rules. 
But it turns out that the meta-methodology is more a version of level II 
m-conventionalism elevated to level III. If there is a significant theory of 
CR in LSD beyond the blandest of directives it turns out to be more akin 
to m-conventionalism than m-falsificationism. To see that this is so let 
us investigate the procedure Popper recommends in LSD for adjudicating 
between the three rival theories of method he distinguishes, naive induc- 
tivism, m-conventionalism and m-falsificationism (assuming that we can 
set aside any logical or epistemological criticisms from below of the rules 
of these rival methodologies). 

The cornerstone of Popper’s m-falsificationism is the role that experience 
plays in the rejection of theories or their tentative acceptance; either obser- 
vation reports provide knock-outs for theories (they falsify theories) or the 
theory escapes, for the time being, what might have seemed a knock-out 
(they corroborate theories). In contrast m-conventionalism provides no 
such role for evidential considerations; its criteria for theory acceptance and 
rejection are largely non-evidential. Now if we move up to level III meta- 
methodology it appears that, unless we go to unusual lengths to provide 
them, such evidential considerations of the sort that might apply in using a 
level II methodology to adjudicate between rival level I theories can play 
no role in the use of level III meta-methodology to adjudicate between 
rival methodologies at level II. This suggests that meta-methodological 
principles cannot straightforwardly be level III versions of level II m- 
falsificationist rules. In the LSD Popper seems to recognise this point 
because all his criteria for adjudicating between rival level II methodologies 
are non-evidential in the sense above, i.e., the criteria do not depend on the
falsification of methodologies or their tentative acceptance on the basis of
failed attempts to falsify them. At least three passages in the LSD will be
discussed which mention level III meta-rules for adjudicating between
level II methodologies. It will turn out that in these passages level III
meta-rules are proposed which are strongly non-evidential in character 
in that they concern such matters as parsimony, consistency and power 
to solve problems. Thus Popper turns out to be, in level II methodology, 
strongly anti-conventionalist while in level III meta-methodology he is 
conventionalist. 

The first piece of evidence for the claim that Popper is an m-con- 
ventionalist in meta-methodology comes from a passage which follows 
Popper’s dismissal of naturalism as a way of deciding what are the methods 
of science. He says: 

I believe that questions of this kind [i.e., how do we decide what are the methods
of science] should be treated in a different way [from naturalism]. For example, we
may consider and compare two different systems of methodological rules; one with,
and one without, a principle of induction. And we may then examine whether such
a principle, once introduced, can be applied without giving rise to inconsistencies; 
whether it helps us; and whether we really need it. It is this type of inquiry which 
leads me to dispense with the principle of induction: not because such a principle 
is as a matter of fact never used in science, but because I think that it is not needed; 
that it does not help us; and that it even gives rise to inconsistencies (LSD, 52-3). 


The general tenor of this passage seems to be that of recommending an 
m-conventionalist meta-methodological rule of parsimony, viz., ‘Do not
multiply rules of method unnecessarily—if you can get by without a rule 
because you do not need to use it or it does not help you, give it up.’ Such 
a principle of parsimony could apply within a methodology to reduce the 
number of rules to the strict minimum necessary to evaluate level I theories. 
More significantly, it could be used to adjudicate between level II metho- 
dologies on the grounds of the number of rules each employs. However, if 
such a meta-criterion is to work satisfactorily each methodology needs to 
be formulated more precisely and completely than it has been in order
to judge, on the basis of the number or the simplicity of the rules they 
employ, which is the more parsimonious. In this respect most formulations 
of methodologies are no better than collections of rules of thumb (including 
that offered in section 3 as an account of m-falsificationism) and therefore 
cannot be adequately judged for parsimony of rules. A classic example of 
the rejection of a methodology on the basis of superfluity of rules is Popper’s 
rejection of naive inductivism; his ‘deductivist’ account of methodology 
allegedly gets by, assuming all else equal, without a rule of induction (see 
LSD, section 3). 

Popper appeals to another meta-methodological rule in talk of rejecting 
rules which ‘give rise to inconsistencies’. This can be understood in two 
ways. First, we are bid not to adopt inconsistent level II methodologies; 
second, we are bid not to adopt level II methodologies which give rise to 
inconsistent appraisals of level I theories. Whether or not inductivism is 
inconsistent in either of these ways, as Popper charges, need not concern 
us. What is clear is that there is some meta-methodological requirement of 
consistency which most methodologists would support (except perhaps 
those who adopt dialectical or paraconsistent logics—but even here con- 
sistency principles are adopted at higher meta-levels). However such a 
meta-requirement may be too weak since there may be several consistent 
rival methodologies. Popper admits as much when he says of the rival m- 
conventionalism: ‘Attempts to detect inconsistencies in it are not likely to 
succeed’ (LSD, 80). 

We need some help in deciding what Popper means in the above quotation 
by requiring that a rule ‘help us’. This could be understood in the light of 
the above parsimony requirement that the rule not be an otiose part of a 
methodology. But there is a second passage from the LSD which suggests 
that the help methodological rules provide is that of solving problems in 
the realm of the theory of knowledge: 




There is only one way, as far as I can see, of arguing rationally in support of my 
proposals. This is to analyse their logical consequences: to point out their fer- 
tility—their power to elucidate the problems in the theory of knowledge (LSD, 
38). 


It may be too simple to say that the problem-solving power of meth- 
odologies arises from an analysis of their logical consequences; however, 
how they achieve their solutions is not a matter we need enter into. What
lies behind Popper’s remark is the following analogy: just as level I theories 
provide solutions to problems in the empirical domain and there are level 
II methodological rules which can be used to adjudicate on the comparative 
merits of the problem-solving powers of level I theories (providing a suit- 
able measure of their problem solving power is available), so, analogously, 
level II methodological rules are envisaged as providing solutions in a 
conceptual domain (the theory of knowledge) and there is a level III meta- 
methodological rule which can adjudicate on the comparative merits of the 
problem solving powers of level II methodologies (providing a suitable 
measure of their problem solving power is available). Granting this analogy, 
the problem that now needs to be considered is this: do either of m- 
falsificationism or m-conventionalism have a rule which when elevated to 
level III can be used to adjudicate between the problem-solving powers of 
the rival methodologies? So far neither methodology has been spelled out 
sufficiently fully to determine whether they do have such a rule at level II 
(and this despite the list of rules given in section 3 for m-falsificationism). 
It may be the case that both methodologies share such a rule (in fact this 
will be maintained below). For we cannot argue that their respective sets 
of rules should be mutually exclusive. Already we have envisaged that they 
have in common some consistency rule: they may share other meta-rules 
as well. Further, an answer to the above question may be bedevilled by 
exactly what are the problems to be solved in the theory of knowledge, what 
criteria there are for providing successful solutions to problems, and even 
what are the identity conditions for distinct problems in the theory of 
knowledge so that we may determine in some quantitative or comparative 
way how the methodologies may be gauged for their problem-solving 
power. Let us consider for each methodology some level II rule about the 
problem-solving power of level I scientific theories and what corresponding 
level III rule there may be. 

Since m-conventionalists characteristically employ non-evidential cri- 
teria for deciding between level I theories, such as simplicity, elegance, 
explanatory power, and so on, then there is no reason why they should not 
also be able to provide a rule for adjudicating between these theories based 
on a measure of the problem-solving power of each, or a comparative 
measure between a pair of theories. Thus consider the comparative case for 
a pair of theories $T_1$ and $T_2$ and the problems that can legitimately be posed
for each and the solutions they offer. We could say that $T_1$ solves more
problems than $T_2$ under at least the following conditions: (i) $T_1$ provides a




solution to a legitimate problem while $T_2$ provides none; (ii) $T_1$ provides a
more accurate solution to a problem than the solution $T_2$ provides; (iii) $T_1$
provides a correct solution to a problem while $T_2$ provides an incorrect
solution; and so on. Given such a sketch for part of a comparative measure 
for problem-solving, m-conventionalists can easily promulgate a rule for 
theory acceptance and rejection based on it. There is no obvious objection 
to such a rule being elevated to a level III meta-methodological rule except 
that the rule will suffer a diminution in precision. Assuming we can satis- 
factorily individuate problems in the theory of knowledge and agree on 
criteria for their solution then we may still lack a comparison of, say, how 
much more accurately one methodology solves problems than another. 
Moreover it should be noted that whereas level I theories solve problems 
of an empirical nature, level II methodologies are required to provide 
solutions to problems of a non-empirical or conceptual nature. Since it is 
notorious that criteria for successful problem-solving of the latter sort are 
either imprecise unless well formalised or, at worst, are not agreed to, then 
the presumption that a solution to a problem has been found may be 
doubtful and thus the level III meta-methodological rule will suffer from 
increased vagueness or lack of applicability. However, despite these diffi- 
culties, some meta-methodological conventionalist rule based on measures 
of problem-solving does seem possible and even limitedly viable to adjudi- 
cate between level II methodologies. 

Is there a meta-methodological rule that m-falsificationists could call 
their own for adjudicating between methodologies as to their problem- 
solving powers, or does it turn out to be no different from the con- 
ventionalist level III rule just mentioned? It seems that the latter is more 
likely than the former though the issue is not a clear one. Though it was 
not mentioned in section 3, problem-solving looms large in Popper’s view 
of science and its methodology. There are traces of it in LSD, but it has 
come more fully into its own in subsequent works.¹ It is best illustrated in
the oft-repeated schema: $P_1 \rightarrow \mathrm{TT} \rightarrow \mathrm{EE} \rightarrow P_2$, where $P_1$ is the initial problem
to be solved, TT is a tentative theory offered as a solution, EE is the
process of error elimination practised on TT and $P_2$ is a new problem
bequeathed to us when TT inevitably succumbs to the process of error
elimination.² But is this schema accompanied by a distinctive methodological
rule based on measures of problem-solving power, or can it be
understood in terms of already familiar rules of m-falsificationism? In
discussing this schema Popper declares: ‘Thus we may say that we can
gauge the progress we have made by comparing $P_1$ with some of our later


¹ See the index of Popper [1963] under ‘science progresses from problems to problems’ for
its appearance in the papers collected in that volume. Popper [1972] contains the major 
papers concerning the theme of problem-solving. 

² For an account of this quasi ‘dialectical’ or evolutionary process involved in problem-solving
see § 29 of ‘Karl Popper: Intellectual Autobiography’ in Schilpp [ed.] (1974). 




problems ($P_n$, say).’¹ This suggests that if $P_n$ does not contain some of the
problems of $P_1$ then progress has been made in problem-solving. Such a
rule for the selection of some $TT_i$ based on a comparative measure of
problem-solving powers seems to differ in no way from the relevant rule of 
m-conventionalism. 

However the matter may not be as straightforward as this. Popper regards 
the above schema as one way of interpreting the history of science, the 
example he offers being that of Galileo’s proposed solution to the problem 
of the tides (which, as it turned out, was an unsuccessful solution). On his 
proposed analysis $P_1$ turns out to be a complex problem situation which is a
rational reconstruction of the background Copernican theoretical frame- 
work in which the more specific problem of why there is tidal movement 
was set. If the $P_i$ denote problem situations rather than specific problems to
be solved (e.g., why is there tidal movement?) then there do not seem to 
be discretely individuable problems with their proposed solutions which 
should enable us to employ a rule for adjudicating between theories as to 
their problem-solving powers. In that case we must fall back on the various 
$TT_i$ and the EE process and consider in what way the various $TT_i$ can be
assessed. This turns out to be none other than the familiar rules concerning 
falsification and corroboration that Popper has advocated since LSD, with 
an element of verisimilitude also thrown in. Such Popperian rules for 
assessing the $TT_i$ on the face of it have little to do with problem-solving.
Perhaps one could tie some notion of problem-solving to Popperian notions 
of content, corroboration and/or verisimilitude in much the same way as 
Popper cashed out in LSD the notion of simplicity in terms of degree of 
falsifiability. However this would merely be a façon de parler in which a 
rule about problem-solving would merely be some other rule or rules of 
characteristically m-falsificationist theory appraisal. 

Let us grant such a façon de parler rule at level II. What would it be like to 
elevate such a rule to level III to adjudicate between level II methodologies? 
Since most of Popper’s measures such as degree of falsifiability, content, 
corroboration and verisimilitude have been defined for only empirical 
theories there is no clear sense in which they could be applied to non-em- 
pirical or conceptual theories. As Popper has emphasised, all these theories 
have zero content. Moreover, they are not amenable to corroboration in the 
sense of having evidence in their favour which provides a severe test for 
them. Nor is it easy to see how verisimilitude measures in terms of truth 
and falsity content could apply to them. Thus there seems to be no significant 
façon de parler rule at level III for comparing level II methodologies 
that corresponds to the characteristic rules of m-falsificationism for 
adjudicating between level I theories. 

1 Popper [1972], p. 165. See also p. 288 where problem-solving comes in degrees; we may 
have to ask, ‘How well does our theory solve its problems, that is, P₁?’ In addition Popper 
is concerned to point out that any solution to a problem should also lead to ‘newly emerging 
problems different from the old ones’. If we take this seriously then we must not only consider 
in our measure solutions to old problems but also newly emerging unsolved problems! 

2 See Popper [1972], chapter 4, ‘On the Theory of the Objective Mind’, sections 5 to 10 
inclusive. 

The upshot of the above discussion seems to be that there is a level III 
meta-methodological rule that both m-conventionalists and m-falsifica- 
tionists could use to adjudicate between rival level II methodologies to 
see how they compare in their power to solve problems in the theory of 
knowledge. However the rule is more characteristic of m-conventionalism 
than m-falsificationism. None of the characteristic critical rules of the latter 
methodology involving falsification or corroboration seem applicable at 
level III to deal with level II methodological theories. 

The third, and final, piece of evidence comes from Popper’s conven- 
tionalist view of definitions: 

My only reason for proposing my criterion of demarcation is that it is fruitful: that 
a great many points can be clarified and explained with its help. ‘Definitions are 
dogmas; only the conclusions drawn from them can afford us any new insight’, says 
Menger. This is certainly true of the definition of the concept ‘science’. It is 
only from the consequences of my definition of empirical science, and from the 
methodological decisions which depend upon this definition, that the scientist will 
be able to see how far it conforms to his intuitive idea of the goal of his endeavours. 

The philosopher too will accept my definition as useful only if he can accept its 
consequences. We must satisfy him that these consequences enable us to detect 
inconsistencies and inadequacies in older theories of knowledge, and to trace these 
back to the fundamental assumptions and conventions from which they spring. But 
we must also satisfy him that our own proposals are not threatened by the same 
kind of difficulties. This method of detecting and resolving contradictions is applied 
also within science itself, but it is of particular importance in the theory of knowl- 
edge. It is by this method, if by any, that methodological conventions might be 
justified, and might prove their value (LSD, 55). 


By what kind of level III rule does Popper ask a philosopher to assess 
his r-conventions of method and the d-convention for the term ‘science’ 
based on these? Asking the philosopher to accept his definition ‘only if he 
can accept its consequences’ is either to ask the philosopher to commit the 
fallacy of affirming the consequent or, somewhat surprisingly, to ask the 
philosopher to accept his view on the basis of the inductive support it 
receives from accepted consequences. The other suggestions we have met 
before. They involve some meta-methodological rule either concerning 
consistency or problem-solving power (in that they invite us to consider 
matters such as ‘inadequacies in theories of knowledge’, or the extent to 
which they ‘clarify points’, are ‘fruitful’ or provide ‘new insight’). And how 
is the scientist to judge Popper’s conventions? By seeing whether the goal 
G of the methodology <G, R> is a goal the scientist can accept. But this 
hardly answers our question about what level III meta-methodological 
rules we should use to judge rival level II methodologies; either this is a 
purely level II matter about goals for science or it simply raises over again 
the question about what level III meta-rules we should adopt. However 
perhaps it contains the germ of an idea which we will look at in section 7. 




It turns out, then, that Popper’s level III meta-methodological rules 
advocated in LSD for assessing rival level II methodologies concern con- 
sistency, parsimony and problem-solving power.¹ Either these meta-rules 
would be maintained by most methodologists (e.g., the rule of consistency) 
or they are characteristic of m-conventionalism elevated to level III. They 
are uncharacteristic of m-falsificationism in that they lack the hard-hitting 
power of rules which appeal to evidential considerations, and must, it 
seems, necessarily suffer this lack. It turns out that Popper is a hard-line 
m-falsificationist when it comes to level II theories of method but an m- 
conventionalist when it comes to level III meta-methodological rules for 
deciding between rivals at level II. If there is an over-arching methodology 
of CR applicable to philosophy then it has strongly conventionalist aspects. 
Given that he simply bans m-conventionalism at level II it may seem odd 
to adopt it at level III. But there is nothing inconsistent in this; each 
methodology is restricted to its own appropriate level. However it does 
weaken the case for an exclusive level II methodology of m-falsificationism. 

There is a deeper reason why Popper adopted an account of methodology 
as a set of d- and r-conventions and proposed a conventionalist meta- 
methodology for dealing with rival accounts of method. Popper has always 
been a scientific realist but he has never been able to show that his rules of 
method are guaranteed to yield truths about the world. At the beginning 
of section 3 a theory of method was defined as the pair <G, R> and a list of 
the various goals Popper has proposed for science was set out. If the goal 
of science is to arrive at theories with a high degree of falsifiability (which 
Popper has argued is the same as high informativeness or low prior proba- 
bility) then the rules of method listed in section 3 are designed to promote 
that goal. But why adopt such a goal? Moreover, Popper’s critics have 
insisted that we do not aim at theories with low prior probability but rather 
high posterior probability given some evidence E which supports the theory. 
Popper’s response is to appeal to his notion of non-probabilistic corrob- 
oration, i.e., we aim at theories which have survived strongly mounted 
attacks to show them false and which are at the same time highly informative 
and therefore still very likely false. But what is so great about theories which 
are bold, have survived tests but which are increasingly unlikely? Scientific 
inquiry conducted according to Popperian rules and aims seems to yield 
nothing positive about the world that scientists are allegedly investigating. 
Since the early 1960s Popper has appealed to verisimilitude as a central 
aim of science, that is, we can aim for highly informative (and therefore 
improbable) theories which even though they may be false contain truth in 
excess of falsity with respect to their rivals. However this new aim outstrips 
the ability of the rules listed in section 3 to yield the aim; there is no 
guarantee that following the rules will yield the goal of increasing veri- 
similitude for our theories. In the absence of some realist purchase for the 
rules of method Popper’s pre-1960 account of them is only a conventionalist 
one. However in post-LSD writings Popper has attempted to link his rules 
of method to actual science thereby giving the rules an empirical rather 
than an a priori character. The remaining sections investigate this attempt 
to inject a realist element into m-falsificationism. 

1 One can find in subsequent works by Popper conventionalist rules for dealing with non- 
empirical theories such as philosophies or methodologies: ‘Now if we look upon a theory as 
a proposed solution to a set of problems, then the theory immediately lends itself to critical 
discussion—even if it is non-empirical and irrefutable. For we can now ask questions such 
as, Does it solve the problem? Does it solve it better than other theories? Has it perhaps 
merely shifted the problem? Is the solution simple? Is it fruitful? Does it perhaps contradict 
other philosophical theories needed for solving other problems?’ (Popper [1963], p. 199). 
Most of these questions require an answer based on conventionalist rules of method of the 
sort discussed above; the last question suggests a coherence meta-methodological rule. The 
passage occurs in the essay ‘On the Status of Science and Metaphysics’ in which Popper 
makes an attempt to discuss his critical method as applied to non-empirical theories. In his 
reply to Lakatos in Schilpp (ed.) [1974], p. 1010, we are referred to this essay for a discussion 
of ‘how one can criticize and reject a philosophical theory’. On the same page it is also 
emphasized that Popper’s theory of method is not a falsifiable theory: from this we can con- 
clude that the rules of m-falsificationism cannot in general apply to philosophical theories. 


6 OBJECTIVE PROGRESS IN THEORIES OF METHOD 


Theories of method are not static; as was envisaged in section 2, they have 
a history and can be said to grow (consider the attention paid to developing 
rules for the acceptance and rejection of statistical hypotheses not envisaged 
in most methodologies proposed before the mid-nineteenth century). How- 
ever if methodological rules are purely conventional not much can be said 
about growth in methodology except that the overall number of rules has 
increased. What one feels acutely about a conventionalist view of meth- 
odology is that nothing can be true or false, valid or invalid, in this disci- 
pline. Popper says of some of the rather obvious rules he proposes: ‘Pro- 
found truths are not to be expected of methodology’ (LSD, 54). But if the 
conventionalist view of methodological rules is correct no truths are to 
be expected at all, let alone any profundity. We cannot claim that one 
methodology is more true than another or is more valid in its application 
than another. However there is one place where Popper does come out in 
favour of the view that methodological rules have some truth content. 

In the ‘Addendum’ to the 1962 edition of the second volume of The Open 
Society and Its Enemies, Popper tells us: 


... nevertheless we may take the idea of absolute truth—of correspondence to the 
facts—as a kind of model for the realm of standards, in order to make it clear to 
ourselves that, just as we may seek for absolutely true propositions in the realm of 
facts or at least for propositions which come nearer to the truth, so we may seek for 
absolutely right or valid proposals in the realm of standards—or at least for better, 
or more valid, proposals (Popper [1962], pp. 385-6). 


The standards Popper refers to include not only the ethical standards we 
adopt of which we can say that they are right or wrong, good or bad, but 
also the standards expressed in our methodological rules. Popper is at pains 
to point out that there is a dualism of facts and standards and two important 
asymmetries between facts and standards. First, in the decision to adopt a 
certain proposal (convention) we thereby create a standard at least tenta- 
tively; but in making a decision to accept a proposition we do not thereby 
create a fact. Second, facts are evaluated by standards and not conversely. 
The asymmetries clearly suggest that Popper includes methodological rules 
under the rubric of standards. Both realms also have regulative ideals 
associated with them. In the realm of fact the regulative ideal is of the 
correspondence of a proposition with a fact (or, extending this, the veri- 
similitude of our theories). In the realm of standards the regulative ideal is 
captured by terms such as ‘right’ or ‘good’. Thus if our standards are moral 
standards then the regulative ideal concerns the rightness (as opposed to 
wrongness) or the goodness (as opposed to the badness) of the standards of 
morality we do in fact adopt. Popper admits however that this regulative 
ideal is less clear than that in the realm of facts. Consider now standards 
which concern the appraisal of propositions or theories. Analogously, the 
regulative ideal for methodological standards we do in fact adopt concerns 
their rightness (as opposed to wrongness) or their validity (as opposed to 
their invalidity). Thus Popper seems to be setting the scene for a shift in 
his view of the status of methodologies. It does make sense to investigate 
to what extent any methodology approaches its regulative ideal of rightness, 
correctness or validity. In the quotation above Popper italicizes the word 
‘seek’. If methodological standards are conventions then we know in 
advance that there is nothing to be sought; it is we who create conventions. 
However if we seek ‘absolutely right standards’ then even when we fail to 
find them, or can never be sure that we have found them, the logic of 
the word ‘seek’ presupposes that success is in principle possible for our 
searching. 

The general drift of the above is that methodologies can in principle be 
appraised for their partial correctness. ‘Whatever we accept we should trust 
only tentatively, always remembering that we are in possession, at best, of 
partial truth (or rightness), and that we are bound to make at least some 
mistake or misjudgement somewhere—not only with respect to facts but 
also with respect to the adopted standards’ (Popper [1962], p. 391). Falli- 
bilism extends even to theories of method. However progress is discernible 
between fallible methodologies just as it is between fallible level I theories: 
‘One methodological theory, M, is better than another methodological 
theory M’, if and only if, M is more like the absolutely correct method of 
our regulative ideal than M’.’¹ If we give up the view that methodologies 
are truth-valueless conventions and accept something like the above view 
that methodologies can be appraised for their correctness or validity then 
the next question arises: what meta-methodological rules do we use to 
carry out this appraisal? None is suggested in the ‘Addendum’ but there is 
a distinct kind of answer Popper has advocated in post-LSD works since 
the 1953 paper ‘Philosophy of Science: A Personal Report’ better known 
under the title ‘Science: Conjectures and Refutations’; the same views have 
been more explicitly reiterated in his ‘Replies to My Critics’. 

1 This is adapted from Sarkar [1983], p. 35. Sarkar has a quite useful discussion of a number 
of issues concerning Popperian methodology, in particular whether or not Popper’s rules 
are non-conventional and have truth-values. 


7 AN EMPIRICAL BASIS FOR POPPERIAN METHODOLOGY 


So far theories of method have been elaborated, and attempts have been 
made to evaluate them, in total isolation from the history of science and the 
way scientists have regarded one another’s hypotheses. The grounds for 
this separation of methodology and actual science are based in part on 
Popper’s arguments against a naturalistic view of method and the logical 
gap that exists between the prescriptive oughts of methodological rules and 
the descriptive facts concerning the history of science and the behaviour of 
scientists. This shows only that certain kinds of relations between metho- 
dology and actual science are impermissible; it does not show that no 
relations whatever are possible between methodology and actual science. 
In fact a quasi-empirical base can be found from which a critical attack can 
be launched against methodologies. The nature of this base is best described 
by Popper in sections 5 to 7 of ‘Replies to My Critics’ (Schilpp (ed.), 
[1974]). 

First Popper picks out instances of great, or heroic, science, viz., those 
theories proposed and elaborated by Galileo, Kepler, Newton, Einstein and 
Bohr. The list could be extended indefinitely, but these are the central cases 
which, according to Popper, should appear on any list of great science. 
Popper is also adamant that certain instances of alleged science should not 
appear on the list, whether extended or not. The counter-instances include 
Marxism and Freudian and Adlerian psychology. How these instances and 
counter-instances are picked out is a crucial matter that will be addressed 
shortly. For convenience refer to the two lists of instances of, and counter- 
instances to, great science as sub-lists of the ‘basic list’. Consider now 
Popper’s reasons for adopting his proposed definition of science. In ‘Replies 
to My Critics’ Popper reiterates his demarcation criterion (much along the 
lines found in section 3) and emphasises once more the d-conventional 
character of any attempt to provide a definition of ‘science’ and then says: 


If I define ‘science’ by my criterion of demarcation (I admit that this is more or 
less what I am doing) then anybody could propose another definition, such as 
‘science is the sum total of true statements’. A discussion of the merits of such 
definitions can be pretty pointless. This is why I gave here first a description of 
great or heroic science and then a proposal for a criterion which allows us to 
demarcate—roughly—this kind of science (Schilpp (ed.), [1974], vol. 2, p. 981). 


The last sentence suggests that a new kind of level III meta-methodological 
rule ought to be used to adjudicate between rival definitions of science 
(though Popper discusses only his own level II methodology in this respect). 
It bids us adopt that level II methodology that (a) picks from the basic list 
instances of great science and does not exclude any and (b) does not pick 
from the basic list any counter-instances of great science. Thus meth- 
odologies are fitted out with potential falsifiers once the basic list has been 
drawn up; they are refuted if they exclude an instance of great science or 
include a counter-instance. Popper is so strongly committed to what is in 
the basic list, especially the sub-list of counter-instances, that in reply to 
Lakatos’ question ‘Under what conditions would you [Popper] give up 
your demarcation criterion [i.e., level II methodology]?’ Popper answered: 
‘I shall give up my theory if Professor Lakatos succeeds in showing that 
Newton’s theory is no more falsifiable by “observable states of affairs” than 
Freud’s’ (Schilpp (ed.) [1974], vol. 2, p. 1010). The above suggests that the 
level III meta-rule is really a version of a level II m-falsificationist rule 
elevated to level III, viz., the rule for the falsification of theories based on 
Modus Tollens. In this sense we can regard Popper’s new level III meta- 
rule as belonging properly to an empirical view of methodologies; they are 
to be accepted or rejected with respect to a basic list of instances of, and 
counter-instances to, great science. 
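
The test just described can be put schematically. The following Python sketch is only an illustration under stated assumptions: a level II methodology is modelled as a yes/no verdict on named theories, the two sub-lists are those Popper mentions in the text, and the names `falsifying_cases`, `is_refuted`, `strict_demarcation` and `permissive_demarcation` are hypothetical labels introduced here, not anything found in Popper or Lakatos.

```python
from typing import Callable, List

# Assumption: a level II methodology is treated as a predicate that says
# whether a named theory counts as scientific.
Methodology = Callable[[str], bool]

# The two sub-lists of the basic list, as given in the text.
GREAT_SCIENCE = ["Galileo", "Kepler", "Newton", "Einstein", "Bohr"]
COUNTER_INSTANCES = ["Marxism", "Freudian psychology", "Adlerian psychology"]


def falsifying_cases(methodology: Methodology,
                     instances: List[str],
                     counter_instances: List[str]) -> List[str]:
    """Return the items on the basic list that refute the methodology."""
    # Clause (a): no instance of great science may be excluded.
    wrongly_excluded = [t for t in instances if not methodology(t)]
    # Clause (b): no counter-instance may be admitted.
    wrongly_admitted = [t for t in counter_instances if methodology(t)]
    return wrongly_excluded + wrongly_admitted


def is_refuted(methodology: Methodology,
               instances: List[str],
               counter_instances: List[str]) -> bool:
    """Modus Tollens at level III: one falsifying case refutes the methodology."""
    return bool(falsifying_cases(methodology, instances, counter_instances))


# Two hypothetical methodologies, invented purely for illustration.
def strict_demarcation(theory: str) -> bool:
    return theory in GREAT_SCIENCE


def permissive_demarcation(theory: str) -> bool:
    return True


print(is_refuted(strict_demarcation, GREAT_SCIENCE, COUNTER_INSTANCES))
print(falsifying_cases(permissive_demarcation, GREAT_SCIENCE, COUNTER_INSTANCES))
```

The sketch makes plain that everything turns on the two lists supplied as inputs; the rule itself says nothing about how they are to be drawn up.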

Where does the basic list come from? This is crucial for it provides the 
test base against which methodologies are evaluated. Second, how secure 
are the judgements about the sub-list into which each theory is placed? As 
is well known, Popper has an anti-foundationalist view of observational, or 
basic, statements against which level I scientific theories are to be tested. 
These are allegedly adopted by convention (i.e., their acceptance or rejection 
is based on decisions) and are as fallible and revisable as the theories they 
test (see LSD, sections 28 to 30). Such anti-foundationalist fallibilism ought 
to be extended to meta-methodology as well as methodology as a matter of 
consistency (unless some compelling reason is given for abandoning such 
a characteristically Popperian feature of methodology at the meta-level— 
but none is forthcoming). Thus, where T is a theory, claims of the form ‘T 
is a [counter-] instance of great science’ are at least as fallible as Popper’s 
basic statement ‘Here is a glass of water’ (see LSD, 95). In fact more so; 
while one might feel quite secure about the latter claim in most circum- 
stances, one might well feel much more insecure about the former kind of 
claim in a large number of cases. More often than not Popper presents his 
basic list somewhat dogmatically without proper cognizance of the fallibility 
of judgements of the form ‘T is [is not] great science’ that his general 
epistemological stance calls for. 

When one considers how the basic list is arrived at it seems as if Popper 
plucks it out of the air; the list is based simply on Popper’s intuitions as to 
what is and is not great science. In contrast Lakatos suggests a different 
procedure.¹ He appeals to the general community of scientists past and 
present, the so-called scientific élite. They differ as much as do philosophers 
over what counts as a definition of science or a theory of method. However 
it is supposed that they would generally agree that a particular theory was 
an instance of, or counter-instance to, great science, or that a particular 
move in the game of science from theory T to theory T’ was acceptable or 
unacceptable, while at the same time not agreeing, or even not under- 
standing, why the move is acceptable or unacceptable. It is agreements that 
a theory or move is acceptable or unacceptable which determine the items 
on the basic list, not any agreements about how 
concordance is achieved (or ought to be achieved). Such agreements need 
not be reached over all sciences but just those cases that are the central ones 
on each sub-list. 

1 See Lakatos [1978], pp. 123-31. 

Appeal to the scientific élite raises problems of its own. First, criteria are 
needed for individuating members of the scientific élite whose judgements 
are then used to generate the basic list. On pain of circularity one cannot 
test for such scientists by asking whether they approve or disapprove of 
some list of alleged sciences or whether they adopt such-and-such a metho- 
dological view of science. Rather, it is the independently chosen élite who 
are to generate the basic list which in turn is used to test methodologies. 
The problem of individuating the élite is reminiscent of Popper’s second 
consideration against a naturalistic methodology; not only is the definition 
of ‘science’ a matter of convention but also who is to be called a ‘scientist’. 
In appealing to an élite problems about what is or is not great science have 
been pushed back to who is or is not a scientist. This is somewhat irritating 
not because it bases methodology on a sociological investigation into who 
is a scientist (which as a matter of fact it does not) but because we seem not 
to have really escaped our original difficulty of establishing a basic list; all 
we have done is replace a list of alleged sciences by a list of alleged scientists 
and contrasted it with a list of alleged charlatans of pseudo-science. 

Second, in eliciting a response from each member of the élite we must 
ensure that their answer is not tainted by an appeal to any methodology to 
ascertain whether or not some theory is a piece of great science. Each élite 
scientist must be told that his or her answer should not be arrived at by 
invoking any methodology to determine whether or not a theory, or some 
move in the game of science, is scientific. This would invalidate the purpose 
for which the list is being drawn up, viz., to test methodologies. Third, it 
is not clear to what extent, say, a research scientist dealing with ways of 
controlling grass grubs in pastureland can comment about, say, the alleged 
scientific character of theories in economics or Freudian psychology. Per- 
haps the best one can hope for is peer evaluation of one’s own field of work. 
But in that case perhaps all (or most of) the Freudian analysts would vote 
for the scientific character of Freudian psychology (though to my knowledge 
no such survey has been carried out). Finally, once the responses are in one 
needs a little statistical theory to present the overall result concerning what 
theories go where in the basic list. Is this statistical theory part of a theory 
of method or not? If it is, then we are using the theory before it has been 
properly validated; if not then there are some rules for accepting and 
rejecting hypotheses that lie outside the methodologies which stand in need 
of test. Of course, if such rules are founded outside methodology in, say, 
logic or mathematics, then our original problem as to how rules of method 
are to be justified has been solved for these particular cases. 

How worrying are these points? Since no genuine attempts have been 
made to carry out a survey of scientists along the lines suggested by Lakatos, 
it is hard to tell. At best we have only vague intuitions about the likely 
outcomes of such surveys and the problems of selecting the appropriate 
people to make their comments in a suitably controlled manner. What 
surveys there are concerning the views of scientists make one much less 
sanguine about the outcome of the particular survey envisaged by Lakatos. 
Consider, for example, the range of opinion collected in Collins [1981] in 
a survey of the peer evaluation of the experiments carried out by Joseph 
Weber in an attempt to detect the presence of gravity waves. Whether the 
techniques and apparatus employed by him really detected gravity waves 
became highly controversial. In particular, the terms ‘scientific’ and ‘un- 
scientific’ were employed with respect to Weber’s work in differing ways 
by various respondents reported by Collins. Moving closer to our problem 
area, the results of Kern et al. [1983] in their paper ‘Scientists’ Under- 
standing of Propositional Logic: An Experimental Investigation’ would 
raise a few eyebrows in logical circles. One disconcerting result for Pop- 
perians, who place great emphasis on falsification, came from a survey of the 
views of a group of scientists as to whether Modus Tollens was a valid 
inference schema. Only 41 per cent gave the correct answer in the case of 
a simple abstract schema, which was improved to 69 per cent who answered 
correctly in the case of a simple substitution instance.¹ Clearly if scientists 
are going to be asked any remotely methodological questions the ground 
had better be properly prepared. Such caution is advisable when one con- 
siders the results of a survey into the influence, value and nature of Popper’s 
methodology on practising scientists (supporters such as Medawar, Eccles 
and Bondi aside).² Popperians would be dismayed in some cases not only 
by the responses but by the level of understanding of what Popper’s rules 
amount to by the surveyed and the surveyors alike. Once again some 
divergence can be noted in the use of the predicates ‘scientific’ and ‘unsci- 
entific’. The last two surveys have no relevance to the test of methodologies; 
Popper’s argument against a naturalistic theory of method guarantees that. 
But while the surveys do not bear directly on a survey of the sort envisaged 
by Lakatos they do indicate not only problems in carrying out the surveys 
but also the equivocal character of the outcome in part due to the irration- 
ality exhibited by scientists in the above surveys. This could well become 
important in cases of evaluating sciences which are highly controversial in 
character. 

1 In conclusion Kern et al. say: ‘Despite the limited nature of the tasks selected for inclusion 
in the present investigation, it is noteworthy that a substantial proportion of practising 
scientists leading successful research careers made gross errors of inference on a seemingly 
straightforward test of propositional inference’ (p. 144). Kern et al. also surveyed the 
scientists for their solution of Wason’s ‘four-card selection task’, a test highly relevant to the 
grasp of falsification in science. Only 18 per cent of the scientists correctly identified the two 
falsifying cards. 

2 See Mulkay and Gilbert [1981] who go to the trouble of asking 34 biochemists a number of 
questions about their views on Popper and some of his methodological prescriptions. 
Though, of course, such a procedure must not fall foul of Popper’s arguments against a 
naturalistic account of methodology, it does reveal some of the difficulties about getting a 
survey of the scientific élite off the ground concerning questions about what is and is not 
scientific and of achieving a result of the sort that even remotely coincides with Popper’s 
intuitions about what is and is not great science, or what is or is not an accept- 
able move in the game of science. 

Freudian psychology is a case in point. In considering philosophers only, 
there is a marked contrast in, say, Grünbaum’s intuitions about the worth of 
Freud’s theories (independently of his arguments for their scientific character) 
when compared with Popper’s intuitions. In addition, even Popper’s 
intuitions about what theories are, or are not, scientific have changed.¹ 

Why has meta-methodology been dragged from its rarefied heights down 
into the mire of sociological research? This was to find the set of potential 
falsifiers for methodologies of the form ‘T is [is not] a piece of great science’ 
against which to test level II methodologies. But is this the only way to 
have a genuinely empirical approach to meta-methodology? May there not 
be other ways of empirically evaluating methodologies without appeal to 
the intuitions of a scientific élite about various matters? Such an approach 
is suggested by Lakatos² and Brown [1980]; they propose meta-methodological 
rules that bid us maximise the rationality implicit in human 
intellectual endeavours, and in particular, science. A discussion of their 
meta-methodologies is outside the scope of this paper. 

1 I refer here in particular to Grünbaum [1984] and his series of articles, still continuing, on 
the scientific status of Freud's theories. Popper’s views on Freud’s theories in ‘Science: 
Conjectures and Refutations’ (Popper [1963], pp. 34-8) are well known and are reiterated in 
‘Replies to My Critics’: ‘. . . psychoanalysis was immune [to criticism] to start with, and 
remained so’ (Schilpp (ed.) [1974], p. 985). However Popper does admit on the same page: 
‘It is an interesting psychological metaphysics (and no doubt there is some truth in it, as 
there is so often in metaphysical ideas), but it never was a science.’ Further Popper asserts 
in § 18, p. 164, of Popper [1983]: ‘I at least feel convinced that there is a world of the 
unconscious, and that Freud’s analyses of dreams given in his book are fundamentally 
correct, though no doubt incomplete (as Freud himself makes clear)... .’ Yet Popper’s 
complaint is that Freud argued uncritically for his theory, not that Freud’s theory was itself 
immune to criticism. Whether or not Freud defended his theory badly, it is the latter that 
is more important. It seems distinctly odd to conclude that Freud’s theory of the unconscious 
and the analysis of dreams is true yet is a purely metaphysical theory without any link with 
experience. Rather, Popper’s considerations are too sketchy, as Grünbaum has shown. 
Freud’s theory constitutes a real stumbling block for Popper’s basic list of what is to be 
regarded as a science in the first place. Perhaps Glymour has put his finger on the problem 
when he says, while admitting that there is much bad, as well as good, argument in Freud’s 
work: ‘The mystery of reason is more vivid and more urgent in Freud than in the works of 
any other modern writer, and far outstrips the power of our epistemologies to unravel it’ 
(Glymour [1982], p. 31). In fact a consideration of Freud’s procedures is in part responsible 
for Glymour’s bootstrap theory of evidential support. For a more convincing case of a change 
in view about the status of a theory see Popper’s comments on Darwin’s theory of evolution; 
once he thought this was unfalsifiable but since 1977 he has argued for its scientific status. 
See section 2 of Popper [1978]. 

2 Lakatos [1978], section 2b of chapter 2. 

Let us grant that we have a set of potential falsifiers of the previously- 
mentioned sort for level II methodologies. How does Popper’s level II m- 
falsificationism fare when judged on the basis of the falsificationist level III 
meta-rule? Lakatos’ verdict, which will not be evaluated here, is that it is 
falsified by its own meta-rule.¹ If this is so, what is the virtue in having a
level III falsificationist meta-rule when level II m-falsification is to be 
abandoned? 

Finally, there is an argument, hinted at in the above, in defence of 
Popper’s empirical meta-rule that is definitely wrong. Popper uses his own 
definition of science to show that Marxism and Freudian psychology are 
unscientific because they use conventionalist stratagems in the face of refu- 
tations. Let us suppose that they do use such stratagems. If these two 
theories are declared, for this reason, to be counter-instances to great science 
then they cannot be included on the basic list as instances which any 
methodology, and Popper’s in particular, must exclude from the realm of 
science. Such a procedure is blatantly circular. Without attributing this 
procedure to Popper it is not always clear in what he says about the empirical 
meta-rule outlined in this section that such an argument is not intended. 
The argument can be set out as follows. 

Consider the following argument where ‘T’ stands for any theory and ‘H’
and ‘F’ stand for Marx’s historical materialism and Freudian psychology
respectively:


(1) If any theory T uses conventionalist saving stratagems then T is
not scientific
(2) F and H use conventionalist saving stratagems
∴ (3) F and H are not scientific.


Premise (1) arises from a rule of m-falsificationism. Premise (2) is part of 
what, following Brown [1980], we may call a theoretical reconstruction of an 
episode in the history of human thought. Brown takes seriously Lakatos’ 
dictum that ‘history of science without philosophy of science is blind’.² If
the historiography of science is to be more than a mere chronicle, concepts 
from any level II methodology, whether Bayesian, conventionalist, falsi- 
ficationist, or whatever, must be employed. For example, after researching 
some episode in the history of science we may discover that a correct 
description of the behaviour of some scientist S is as follows: ‘S proposed 
a bold theory with respect to its rivals and conducted a crucial experiment 
which falsified it; to save the theory S added some ad hoc auxiliary hypo- 
theses and then looked for confirming instances of his revised theory.’ This 


1 Lakatos [1978], section 2a of chapter 2. 
2 Lakatos [1978], p. 102.


The Status of Popper’s Theory of Scientific Method 477 


description employs concepts such as ‘bold’, ‘crucial experiment’, ‘ad hoc’ and
‘confirming instances’ which are part of, and are explicated by, various methodologies.
We can say that this description is theoretical in that it employs
concepts characteristic of level II methodologies and it is this that lifts the 
account above that of a methodologically unenlightened chronicle. Scientist 
S may well be confused about scientific methodology in that an account of 
his behaviour involves concepts from rival methodologies and a violation 
of their rules. However this is a judgement to be passed on S’s behaviour 
based on the prescriptions of some methodology and not just its characteristic 
concepts. Such a judgement would be part of what Brown calls a rational 
reconstruction of the historical episode and not merely a theoretical recon- 
struction of the episode which passes no judgements whatsoever. Rational 
reconstructions, in Brown’s sense, tell us how scientists ought to have 
behaved in the light of the prescriptions of some set of methodological rules 
whether they behaved that way or not. Theoretical reconstructions are 
purely descriptive of some episode in the history of science and do not 
commit us to endorsing the rules of some theory of method. Of course, they 
do commit us to some theoretical concepts of scientific method in that the 
historiography is peppered with terms like ‘degenerating research pro- 
gramme’, ‘high probability’, etc.; but this merely gives us the sense in which 
the historiography is a theoretical reconstruction and not a bare chronicle. 
The distinction between theoretical and rational reconstructions is important
in that it enables us to see the fallacy in arguments, all too common, to the 
effect that if an episode in the history of science is written in accordance 
with some theory of method, then that methodology is, somehow, endorsed 
or self-authenticating. (Brown effectively uses his distinction to disarm 
such arguments.) 

Turning back to the above argument, premise (2) is a claim based on a 
theoretical reconstruction of some episodes in the history of science. Such 
a claim can be made only after a careful examination of the relevant episodes 
in the history of science and after the careful application of methodological 
concepts. It is a matter of considerable historiographical controversy as to 
whether Popper is correct in premise (2) about H and F. For the sake of the
argument let us suppose that Popper is correct. Then conclusion (3) is true 
and validly follows from these premises. Now Popper often argues against 
H and F, especially in ‘Science: Conjectures and Refutations’ (in Popper 
[1963]) in the following way. He notes that Einstein’s theory E does not 
employ conventionalist saving stratagems. Supposing that this is a correct 
theoretical reconstruction of E then it is an important difference between 
E on the one hand and H and F on the other. But merely noting this 
difference neither supports nor refutes the view that these are sciences. 
Only when premise (1), based on a methodological rule, is added does any 
conclusion about the scientific character of these theories follow. What 
must be avoided is an argument to the effect that claims of the sort in 
premise (2) alone support the view that H and F are unscientific or that 




they can be used as a test of a methodology. What Popper says in the early 
sections of this paper could be construed in this way; if so the argument 
would be seriously incomplete. 

Let us suppose that we now have a Grünbaumian intuition and make the
following judgement: 


(4) F is a piece of great science. 


Then either (1) or (2) is false. If we assume that premise (2) is correct 
because the theoretical reconstruction has been well done, then premise (1) 
based on the methodological rule is refuted and the methodology is in 
serious trouble. We can now ask where (4) comes from. This is the problem 
we have been discussing, viz., how is the basic list to be drawn up? Setting 
the argument up this way shows that however (4) is arrived at it must be 
supported in ways independent of methodological considerations. This is 
what is so hard about generating claims like (4) (or their negation) in order 
to draw up the basic list in ways other than by appeals to gut intuition. It 
may be a good reason for abandoning this approach to an empirical view of 
meta-methodology. That would not necessarily close all avenues to an 
empirical, as opposed to an a priori, account of meta-methodology.¹
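
The logical point made with (1), (2) and (4) can be checked mechanically. The following is only a minimal sketch under stated assumptions: the two propositional variables and the function names are hypothetical labels introduced for illustration, premise (1) is instantiated for F alone, and judgement (4), ‘F is a piece of great science’, is assumed to entail that F is scientific.

```python
from itertools import product

# Hypothetical propositional encoding (not in the original text):
#   uses_stratagems: "F uses conventionalist saving stratagems"
#   scientific:      "F is scientific"

def premise_1(uses_stratagems, scientific):
    # (1) instantiated for F: if F uses the stratagems then F is not scientific.
    return (not uses_stratagems) or (not scientific)

def premise_2(uses_stratagems, scientific):
    # (2) F uses conventionalist saving stratagems.
    return uses_stratagems

def judgement_4(uses_stratagems, scientific):
    # (4) F is great science, assumed here to entail that F is scientific.
    return scientific

def satisfiable(claims):
    """Return True if some truth assignment makes every claim true."""
    return any(all(claim(u, s) for claim in claims)
               for u, s in product([True, False], repeat=2))

all_three = satisfiable([premise_1, premise_2, judgement_4])   # jointly inconsistent
without_1 = satisfiable([premise_2, judgement_4])              # drop the methodological rule
without_2 = satisfiable([premise_1, judgement_4])              # drop the historical claim
```

On this encoding the three claims cannot all hold, while dropping either (1) or (2) restores consistency, which is the sense in which ‘either (1) or (2) is false’ once (4) is granted.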

Contrast the above with the following procedure (where the $T_i$ are
theories):


(a) $T_1, T_2, \ldots, T_n$ are instances of great science while $T_{n+1}, \ldots, T_m$ are not.
(b) $T_1, T_2, \ldots, T_n$ share methodological characteristics C (e.g., they do not
employ conventionalist saving hypotheses) while $T_{n+1}, \ldots, T_m$ lack C
(e.g., they do employ conventionalist saving hypotheses).


After noting (a) and (b) make the following proposal: adopt the metho- 
dological rule which accords with the basic judgements expressed in (a) and 
let the presence of the shared characteristics C be what the rule applauds 
and let their absence be what the rule condemns. These may not be the 
only grounds on which the rule is adopted; it may be that the shared 
methodological characteristics are also in accordance with some cognitive 
aim as well. This point aside, the methodological rule is adopted because 
it captures best the intuitions expressed in (a), viz., the basic list. This seems 
to be one of the ways Popper argues in support of his m-falsificationism 


1 One way suggested by Popper’s conception of a theory of method, but hardly explored by
him, is to judge exactly how well rules R are conducive to bringing about goal G. That R 
does, or does not, bring about G is an empirical matter to be determined in the same way 
as any regularity of the form: employing means M brings about end E always, or with a 
certain statistical regularity. Thus any methodology (G, R) will have empirical conditions 
of adequacy of R to G. Such a view, common in the theory of action which deals with means 
and ends but relatively unexplored in meta-methodology, could provide a new way of 
understanding meta-methodology empirically. 




in ‘Science: Conjectures and Refutations’ and elsewhere. Setting out his
procedure in this way still leaves one with the question: on what grounds, 
if any, is the basic list (a) drawn up? Clearly methodological rules cannot 
be appealed to as they are arrived at as a conjecture designed to accom- 
modate (a). 


8 CONCLUSIONS 


While Popper has raised significant questions concerning the form and 
content of level II methodologies and raised the need for a level III meta- 
methodology, no unitary picture emerges as to what an overall view of both 
methodology and its meta-theory is like. We are left with two different 
meta-methodological theories, one a priori the other empirical in character, 
each of which is at odds with Popper’s own level II falsificationist metho- 
dology in varying ways. In the rarefied atmosphere of meta-theorising 
significant and contentful meta-rules are hard to come by. But, in so far as 
they can be formulated, Popper’s a priori meta-rules turn out to be cor- 
relates of those banned at level II; and his empirically based meta-rule 
either knocks out his level II methodology (if Lakatos is to be believed) or 
the empirical conditions of its application seem hardly better founded than 
gut reactions about what is or is not science. It may also be disconcerting 
to find that there are distinct sets of meta-methodological rules. However 
there is nothing obviously incompatible about Popper’s a priori and empiri- 
cal meta-theories; so they are not rivals at level III and we are not obliged 
to ascend further to level IV to seek a way to resolve their rivalry. However 
their distinctive character does not suggest that they comprise a unified 
theory of meta-method; this is marked by labelling them ‘a priori’ and
‘empirical’ meta-methodologies. Given this diversity at level III one may
well feel sceptical of the drive for methodological purity and unity at the 
lower level. This paper has not been concerned with the merits of
m-falsificationism, which has been partially specified in section 3 (i.e., logical
and epistemological criticisms of Popper’s rules ‘from below’ have been set 
aside). However it does raise a question about its being the single level II 
methodology and its viability as a level III meta-methodology. 

At the beginning the possibility of an overarching methodology CR, 
viable for all disciplines, was raised. In the course of the paper its prospects 
were considerably diminished. The empirical meta-rule of the previous 
section could not serve as a more concrete version of CR. Even though this 
meta-rule is consistent with the general directive of CR to criticise our 
theories and to attempt to refute them, the kind of refutation envisaged is 
severely restricted to the sciences. Whatever the prospects for a viable meta- 
methodology, the case for an overarching CR has hardly been advanced; it 
still remains the almost contentless directive we met at the beginning. 


Department of Philosophy 
University of Auckland 


REFERENCES 


BOYD, R. N. [1983]: ‘The Current Status of Scientific Realism’, in J. Leplin (ed.), Scientific
Realism, pp. 41-82. Berkeley: University of California Press.

BROWN, J. R. [1980]: ‘History and the Norms of Science’, in P. D. Asquith and R. N. Giere
(eds.), PSA 1980, vol. 1, pp. 236-47. East Lansing: Philosophy of Science Association.

COLLINS, H. M. [1981]: ‘Son of Seven Sexes: The Social Reconstruction of a Physical
Phenomenon’, Social Studies of Science, 11, pp. 33-62.

CRICK, F. and MITCHISON, G. [1983]: ‘The Function of Dream Sleep’, Nature, 304, pp. 111-14.

EVANS, C. [1983]: Landscapes of the Night. London: Coronet.

FEYERABEND, P. K. [1978]: Science in a Free Society. London: NLB.

GIERE, R. N. [1985]: ‘Philosophy of Science Naturalized’, Philosophy of Science, 52, pp. 331-56.

GLYMOUR, C. [1982]: ‘Freud, Kepler and the Clinical Evidence’, in R. Wollheim and J.
Hopkins (eds.), Philosophical Essays on Freud. Cambridge: Cambridge University Press.

GRÜNBAUM, A. [1976]: ‘Ad Hoc Auxiliary Hypotheses and Falsificationism’, The British
Journal for the Philosophy of Science, 27, pp. 329-62.

GRÜNBAUM, A. [1984]: The Foundations of Psychoanalysis: A Philosophical Critique. Berkeley:
University of California Press.

HINTIKKA, J. [1973]: Logic, Language-Games and Information. Oxford: Clarendon Press.

KERN, L. H., MIRELS, H. L. and HINSHAW, V. G. [1983]: ‘Scientists’ Understanding of
Propositional Logic: An Experimental Investigation’, Social Studies of Science, 13, pp.
131-46.

LAKATOS, I. [1978]: The Methodology of Scientific Research Programmes: Philosophical Papers:
Volume I. Cambridge: Cambridge University Press.

MULKAY, M. and GILBERT, G. N. [1981]: ‘Putting Philosophy to Work: Karl Popper’s
Influence on Scientific Practice’, Philosophy of the Social Sciences, 11, pp. 389-407.

MUSGRAVE, A. [1983]: ‘Facts and Values in Science Studies’, in R. W. Home (ed.), Science
Under Scrutiny. Dordrecht: D. Reidel.

POPPER, K. R. [1957]: The Poverty of Historicism. London: Routledge and Kegan Paul.

POPPER, K. R. [1959]: The Logic of Scientific Discovery. London: Hutchinson.

POPPER, K. R. [1962]: The Open Society and its Enemies, vol. 2. London: Routledge and Kegan
Paul.

POPPER, K. R. [1963]: Conjectures and Refutations. London: Routledge and Kegan Paul.

POPPER, K. R. [1972]: Objective Knowledge. Oxford: Oxford University Press.

POPPER, K. R. [1976a]: ‘The Logic of the Social Sciences’, in T. W. Adorno et al., The
Positivist Dispute in German Sociology. London: Heinemann.

POPPER, K. R. [1976b]: ‘The Myth of the Framework’, in The Abdication of Philosophy.
Philosophy and the Public Good. Essays in Honour of Paul Arthur Schilpp, ed. E. Freeman,
pp. 23-48. La Salle, Illinois: Open Court.

POPPER, K. R. [1978]: ‘Natural Selection and the Emergence of Mind’, Dialectica, 32,
pp. 339-55.

POPPER, K. R. [1983]: Realism and the Aim of Science. Totowa, N.J.: Rowman and Littlefield.

PUTNAM, H. [1981]: Reason, Truth and History. Cambridge: Cambridge University Press.

SARKAR, H. [1983]: A Theory of Method. Berkeley: University of California Press.

SHAPERE, D. [1980]: ‘The Character of Scientific Change’, in T. Nickles (ed.), Scientific
Discovery, Logic and Rationality, Boston Studies in the Philosophy of Science, vol. 56.
Dordrecht: D. Reidel.

SCHILPP, P. A. (ed.) [1974]: The Philosophy of Karl Popper. La Salle, Illinois: Open Court.

STRÖKER, E. [1984]: ‘Does Popper’s Conventionalism Contradict His Critical Rationalism?
Objections Against Popper in German Philosophy and Some Meta-Critical Remarks’, in
R. Cohen and M. Wartofsky (eds.), Methodology, Metaphysics and the History of Science,
pp. 263-82. Dordrecht: D. Reidel.

WATKINS, J. [1984]: Science and Scepticism. Princeton: Princeton University Press.


Brit. J. Phil. Sci. 38 (1987), 481-499 Printed in Great Britain


Tests of Significance Following
R. A. Fisher¹


by D. J. JOHNSTONE 


1 Specification
2 Procedure
3 Interpretation
(i) Probability
(ii) The Level of Significance P
(iii) Fisher’s Logic
(iv) The Logic for Inexact Tests
(v) Fisher on Support
(vi) The Logic for Repetition
(vii) The Critical Level $\alpha$
(viii) Fisher’s Dualism
4 Conditionality
5 On ‘Repeated Sampling from the Same Population’
6 Seidenfeld’s Argument
7 Samples Alike in all ‘Relevant’ Respects
8 The ‘Random’ Sample


1 SPECIFICATION
A test of significance following Sir Ronald Fisher is specified by: 


(i) The null hypothesis $h_0$. Typically, $h_0$ is a statistical hypothesis concerning
a variable x over a population $\Pi$.²


1 I wish to thank Professors D. R. Cox (Imperial College, London), I. J. Good (Virginia 
Polytechnic Institute and State University), H. E. Kyburg (University of Rochester), D. V. 
Lindley (University College, London), J. W. Pratt (Harvard University) and T. Seidenfeld 
(Washington University) for helpful comments on the initial version of this paper. 

2 There is one exception. The null hypothesis in a ‘randomization’ test is that
$r_{1u} = r_{2u} = \cdots = r_{tu} = \cdots = r_{Tu}$ for all $u$, whatever the experimental conditions,
where $r_{tu}$ denotes the response on experimental unit $u$ to treatment $t$, and where $T$
denotes the number of treatments under comparison. This hypothesis is not statistical,
but deterministic. Cox and
Hinkley [1974, p. 196] noted that the null hypothesis in a randomization test is deterministic. 
Note that the terms ‘randomization’ test and ‘permutation’ test have at times been used 
interchangeably. Kempthorne and Doerfler [1969, pp. 234-5] distinguished properly 
between these tests. The null hypothesis in a permutation test is statistical, not deterministic. 
Fisher’s nonparametric test with Darwin’s data [1935, pp. 44-8] is an example of a per- 
mutation test. The null hypothesis in this test is that the difference between the heights of 
cross and self-fertilized maize plants is distributed over a specific population of units (sites) 
with zero mean. This is a statistical hypothesis. 


Received January 1985 




(ii) The sampling ‘rule’. The sampling ‘rule’ is a routine with which a
random sample $X = \{x_1, x_2, x_3, \ldots, x_n\}$ is drawn from the population $\Pi$.


(iii) The reference set R. In the absence of any ‘ancillary’ statistic, the
reference set R is simply the set of samples $S = \{X_1, X_2, X_3, \ldots\}$ generated (on
the null hypothesis and sampling rule) by (hypothetical) repeated sampling
from the population $\Pi$. If there is an ‘ancillary’ statistic A = A(X), the
reference set R is the subset of the sample space S ‘conditioned’ on the value
A(X), i.e. R includes only those samples $X \in S$ with the same value A(X) as
the sample X observed.


(iv) The test statistic t = t(X). The statistic t = t(X) is a measure of the
‘discrepancy’ of the sample X with (respect to) the null hypothesis $h_0$.
Fisher used the term ‘discrepancy’ in a primitive (undefined) sense, e.g.
[1925] pp. 318, 321. Note, however, that the ‘discrepancy’ of the sample X
coincides in Fisher’s examples, at least ordinally, with the inverse of the
probability (density) $p(X \mid h_0)$.


(v) The ‘critical’ (Neyman’s term) level of significance $\alpha$.
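
Item (iii) of this specification can be made concrete with a small sketch. It is illustrative only: the function name `reference_set`, the toy sample space, and the choice of sample size as the ‘ancillary’ statistic A(X) are all assumptions introduced here and are not taken from Fisher or from the text.

```python
def reference_set(sample_space, observed, ancillary=None):
    """Return R: all of S, or the subset of S sharing the observed ancillary value."""
    if ancillary is None:
        return list(sample_space)
    a_obs = ancillary(observed)
    return [x for x in sample_space if ancillary(x) == a_obs]

# Hypothetical sample space S: each sample is (sample_size, values).
# Sample size plays the role of the ancillary statistic A(X) in this toy case.
S = [
    (2, (0, 1)),
    (2, (1, 1)),
    (3, (0, 0, 1)),
    (3, (1, 0, 1)),
    (3, (1, 1, 1)),
]
X = (3, (1, 0, 1))  # the sample actually observed

R_unconditional = reference_set(S, X)                           # no ancillary: R = S
R_conditional = reference_set(S, X, ancillary=lambda x: x[0])   # R conditioned on A(X)
```

With an ancillary statistic supplied, only the samples that share the observed value A(X) remain in R; without one, R is the whole of S.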


2 PROCEDURE 


The procedure in a test of significance following Fisher is to measure the
level of significance P = P(X) of the sample X. The level of significance P
of the sample X is defined as the probability on the null hypothesis $h_0$ of a
sample as discrepant with $h_0$ as the sample X. If the level of significance P
of the sample X is less than or equal to the critical level $\alpha$, the procedure is
to ‘reject’ the null hypothesis $h_0$ at the $\alpha$ level of significance. If the level of
significance P is greater than the ‘critical’ level $\alpha$, the orthodox Fisherian
procedure is to ‘not reject’ (but not to ‘accept’) the null hypothesis $h_0$ at the
$\alpha$ level.
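
As an illustration of this procedure, the following sketch computes P exactly for a simple binomial null hypothesis and then applies the reject / not reject rule. The binomial setting, the numbers (16 successes in 20 trials, success probability 0.5, critical level 0.05) and the function names are assumptions made for the example; discrepancy is ordered by the inverse of the null probability, in line with the remark under (iv) above.

```python
from math import comb

def level_of_significance(k_obs, n, p0):
    """P: probability under h0 of a sample at least as discrepant as the one observed.

    Assumption: an outcome counts as 'as discrepant' when its probability under h0
    is no greater than that of the observed outcome.
    """
    pmf = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    threshold = pmf[k_obs] * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(pr for pr in pmf if pr <= threshold)

def fisher_procedure(p_value, alpha):
    """Fisherian verdict: 'reject' h0, or 'not reject' it (never 'accept')."""
    return "reject" if p_value <= alpha else "not reject"

P = level_of_significance(k_obs=16, n=20, p0=0.5)
verdict = fisher_procedure(P, alpha=0.05)
```

Here the samples as discrepant as the observed one are those with at most 4 or at least 16 successes, so P is about 0.012 and the verdict at the 0.05 level is ‘reject’.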


3 INTERPRETATION 
(i) Probability 


Fisher maintained that a ‘probability’ is a state of certainty (or uncertainty) 
in a particular single case. This was a general definition: 


Clearly, the purpose of the notion of probability is to express—and express accu- 
rately, with mathematical precision—a state of uncertainty . . . Probability is, I 
suggest, the first example of a well specified state of logical uncertainty. [1958, p. 
263] 


Thus, following Fisher, the ‘probability’ $p(E)$ of the event E is the degree
of certainty (Carnap’s ‘probability$_1$’¹) of the event E in the single case in


1 Carnap [1950, pp. 23-5] identified the primitive concepts ‘probability$_1$’ (i.e. ‘inductive’
probability) and ‘probability$_2$’ (i.e. ‘stochastic’ probability). Probability$_1$ represents degree
of confirmation, and probability$_2$ represents relative frequency (in the ‘long run’). This is a
useful and convenient terminology.




question. This is the ordinary interpretation within theory for inductive
inference. The issue in inductive inference is how in general to measure
probability$_1$ in the single case. Fisher’s position is that the probability$_1$ $p(E)$
of the event E in the single case X is given by the relative frequency (Carnap’s
‘probability$_2$’) of the event E in the reference set R (where $X \in R$), but
(strictly) on the condition that the reference set R is free from any recognizable
‘relevant’ (with respect to E-ness) subset, e.g. [1956, p. 35]; [1958,
p. 263]; [1959, p. 23]; [1960, p. 5]; [1962, p. 17]. In Fisher’s terms, a subset
$R_1$ from the reference set R is a ‘relevant’ (with respect to E-ness) subset if
the relative frequency $f(E)$ measured over $R_1$ is not the same as the relative
frequency $f(E)$ measured over R, i.e. the subset $R_1$ is a ‘relevant’ subset if
$f(E \mid X \in R_1) \neq f(E \mid X \in R)$, e.g. [1958, pp. 263-7]; [1959, pp. 23-6]. Thus, for
Fisher, the probability $p(E)$ is a legitimate probability$_1$ in the single case X
if and only if our ignorance is such that we cannot recognize any subset $R_1$
such that $f(E \mid X \in R_1) \neq f(E \mid X \in R)$. This is Fisher’s principle of ‘necessary
ignorance’. He wrote:


The necessary ignorance is specified by our inability to discriminate any of the 
different sub-aggregates [subsets] having different frequency ratios, such as must 
always exist. [1956, p. 36] 
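
The definition of a ‘relevant’ subset can be sketched on a finite reference set. This is a toy illustration only: the invented data (throws of a fair die ‘a’ and of a die ‘b’ that cannot show an ace, each thrown with either hand) and the helper names are assumptions, not part of Fisher’s account.

```python
from fractions import Fraction

def relative_frequency(cases, event):
    """Relative frequency of `event` over a finite, non-empty collection of cases."""
    cases = list(cases)
    return Fraction(sum(1 for c in cases if event(c)), len(cases))

def is_relevant_subset(reference_set, in_subset, event):
    """R1 is 'relevant' with respect to E if f(E | X in R1) != f(E | X in R)."""
    subset = [c for c in reference_set if in_subset(c)]
    return relative_frequency(subset, event) != relative_frequency(reference_set, event)

# Hypothetical reference set R of cases (die, hand, face); the data are invented.
faces_by_die = {"a": (1, 2, 3, 4, 5, 6), "b": (2, 3, 4, 5, 6, 6)}
R = [(die, hand, face)
     for die, faces in faces_by_die.items()
     for hand in ("left", "right")
     for face in faces]

def ace(case):
    return case[2] == 1

f_R = relative_frequency(R, ace)
relevant_by_die = is_relevant_subset(R, lambda c: c[0] == "a", ace)
relevant_by_hand = is_relevant_subset(R, lambda c: c[1] == "left", ace)
```

In this toy set the subset of throws made with die ‘a’ has a different frequency of aces from R as a whole and so is relevant, whereas the subset of left-handed throws has the same frequency and is not. If die ‘a’ were recognizable, the frequency over R would not, on the principle just stated, be a legitimate probability$_1$ for a single throw known to belong to that subset.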


Logically, the principle that the reference set R must be free from any
recognizable relevant subset is unnecessarily strong. It is sufficient that the
reference set R, in the single case X, is free from any recognizable relevant
subset {. . X . .}, where a subset {. . X . .} is a subset which includes the single
case X. It seems that Fisher intended this more explicit requirement. In
his classic Statistical Methods and Scientific Inference, he began customarily
with the principle that the reference set R must be free from any recognizable
relevant subset [1956, p. 35], but later in the same work he stated
more specifically that R must be free from any recognizable relevant subset
{. . X . .}, i.e. free from any recognizable relevant subset which includes the
single case X:


The subject of a statement of probability must not only belong to a measurable set, 
of which a known fraction fulfils a certain condition, but every subset to which it 
belongs, and which is characterized by a different fraction, must be unrecognizable. 
[1956, p. 60] 


It would seem that the general principle that the reference set R must be
free from any recognizable relevant subset is shorthand in Fisher’s writing
for the more explicit principle that R must be free from any recognizable
relevant subset {. . X . .}.

Fisher was not a frequentist, at least not in the pure sense of say Peirce,
von Mises or Neyman. Probability in the frequentist sense is merely relative
frequency (probability$_2$). But Fisher was interested in probability
(probability$_1$) in the single case. He maintained that probability in the frequentist
sense (probability$_2$) is logically empty:




. . . consideration is given to the meaning that the word probability must have to
anyone so practically interested as is a gambler, who, for example, stands to gain
or lose money in the event of an ace being thrown with a single die. To such a man
the information supplied by a familiar mathematical statement such as: “If $a$ aces
are thrown in $n$ trials, the probability that the difference in absolute value between
$a/n$ and 1/6 shall exceed any positive value $\varepsilon$, however small, shall tend to zero as
the number $n$ is increased indefinitely”, will seem not merely remote, but also
incomplete and lacking in definiteness in its application to the particular throw [the
single case] in which he is interested. Indeed, by itself it says nothing about that
throw. [1956, p. 34]


(ii) The Level of Significance P 


The level of significance P of the sample X is (by definition) the probability
on the null hypothesis $h_0$ of a sample as discrepant with the null hypothesis
$h_0$ as the sample X. Now, on Fisher’s account: (i) a ‘probability’ is a
probability$_1$ in the single case, and (ii) the probability$_1$ $p(E)$ of the event E
in the single case X is given by the relative frequency (probability$_2$) of the
event E in the reference set R, but on the condition that the reference set
R is free from any recognizable relevant (with respect to E-ness) subset
{. . X . .}. Hence, consistent with this general definition: (i) the level of
significance P of the sample X is the probability$_1$ (on the null hypothesis
$h_0$) in the single case ‘X’, i.e. the single case in which X is the sample, of a
sample as discrepant with the null hypothesis $h_0$ as the sample X, and (ii)
the probability$_1$ P is given by the relative frequency with which samples in
the reference set R are as discrepant with the null hypothesis $h_0$ as the
sample X, but on the condition that the reference set R is free from any
recognizable relevant (with respect to P) subset {. . X . .}. Thus, following
Fisher, the reference set in a test of significance must be free from any
recognizable relevant (with respect to P) subset {. . X . .}. Otherwise, P is
not a legitimate (logical) probability$_1$ in the single case. Indeed, Fisher
made the matter of relevant subsets the cardinal issue in his dispute with
Neyman and Pearson. He berated Neyman and Pearson repeatedly on
the basis that they ignored recognizable relevant subsets. Consider, for
example, his remarks in regard to Student’s t-test:


[Student’s t-test] is more simple than will generally be the case in statistical work,
for in this case no characteristic of the sample (i.e. of the whole body of observations
available) can be found to define a subset to which our sample belongs, and which
might exhibit a different and more relevant, frequency distribution. It is this
simplicity that has deceived those writers [Neyman and Pearson] who have considered
this one alone of the practically useful tests of significance into ignoring
such subsets, or thinking that when such subsets are available their existence can
be ignored . . . [1956, p. 86]¹


1 Note that Fisher’s claim that the reference set in a t-test is free in every case from any
relevant subset, i.e. any relevant subset defined by a statistic f(x), is strictly not correct, cf.
Brown [1967].




Fisher wrote in general about relevant subsets in his work on ‘probability’,
i.e. probability$_1$ in the single case. He wrote at length, often in connection
with his ‘fiducial argument’. If we are to understand Fisher’s tests of
significance we must take account of his work on probability, especially his
thoughts on relevant subsets. But this has rarely been the case.¹ Consider,
for example, the position taken by Oscar Kempthorne, who is perhaps the
leader in the modern school loyal to Fisher. Kempthorne [1979, p. 202]
excludes Fisher’s work on relevant subsets explicitly, including specifically
his remarks quoted above.


(iii) Fisher’s Logic 


The object in tests of significance following R. A. Fisher is to so discredit
the null hypothesis $h_0$ that $h_0$ might practically be excluded. Indeed, Fisher
declared that experiments are conducted solely to allow data the opportunity
to ‘disprove’ the null hypothesis:


Every experiment may be said to exist only to give the facts a chance of disproving 
the null hypothesis. [1935a, p. 16] 


A test of significance yields a level of significance P. Fisher claimed that
the level of significance P entails a simple disjunction: either (the null
hypothesis is true and) an event of probability$_1$ P has occurred or the null
hypothesis is false. Hence, the lower the level of significance P, the stronger
the evidence against the null hypothesis. Moreover, if the level of significance
P is very low, e.g. P = 5%, the null hypothesis is successfully
discredited:


The force with which such a conclusion is supported is logically that of a simple 
disjunction: Either an exceptionally rare chance has occurred, or the theory of 
random distribution [the null hypothesis] is not true. [1956, p. 42] 


Fisher maintained that a test of significance cannot prove or support the
null hypothesis, or provide in any way for its acceptance. He insisted that 
levels of significance P which are not very small constitute insignificant 
evidence against the null hypothesis, not evidence for it: 


. . . the test can never lead us to assert [the null hypothesis] that the two populations
are identical... We can only say that the evidence provided by the data is insufficient 
to justify the assertion that they are different. [1936, p. 58] 


The two classes of results which are distinguished by our test of significance 
are, on the one hand, those which show a significant discrepancy from [the null 
hypothesis]; and on the other hand, results which show no significant discrepancy 
from this hypothesis. [1935a, pp. 16-17] 


1 There are isolated references in the literature on tests of significance to Fisher’s work on
relevant subsets. These include, most obviously, Seidenfeld, T. [1979, pp. 71-3]. Others are
Buehler [1959, p. 847], Lindley [1971, p. 16], Pierce [1973, p. 241], Robinson [1976] and
Rosenkrantz [1977, p. 216].



Fisher described the level of significance P as an objective measure of the
evidence on which we should disbelieve the null hypothesis, or be reluctant
at least to accept that hypothesis. He insisted that P is a measure of inductive
evidence, but that it is not in any way the inductive probability
(probability$_1$) of the null hypothesis, i.e. $P = P(X) \neq p(h_0 \mid X)$:

Though recognizable as a psychological condition of reluctance, or resistance to the 
acceptance of a proposition, the feeling induced by a test of significance has an 
objective basis in that the probability statement on which it is based is a fact 
communicable to, and verifiable by, other rational minds. The level of significance 
in such cases fulfils the conditions of a measure of the rational grounds for the 
disbelief it engenders. It is more primitive, or elemental than, and does not justify 
any exact probability statement about the proposition. [1956, pp. 46-7] 


(iv) The Logic for Inexact Tests 


A null hypothesis which is nonparametric or composite must always reduce
to a logical disjunction ($h_1$ or $h_2$ or $h_3$ or . . .), where each disjunct $h$ is a
simple parametric hypothesis. If the level of significance P is the same on
each disjunct $h$, then the test is described as ‘exact’. If the level of significance
P is not the same on each disjunct $h$, the test is ‘inexact’. Fisher
added to his logic (i.e. his logic for exact tests) to admit inexact tests, by
way of the axiom that a disjunction $h_0$ = ($h_1$ or $h_2$ or $h_3$ or . . .) is discredited
by the sample X to the extent exactly that the least-discredited disjunct
hypothesis $h \subset h_0$ is discredited. Thus, the level of significance P on the
null hypothesis $h_0$ = ($h_1$ or $h_2$ or $h_3$ or . . .) is the highest level of significance
$P_h$ over $h \subset h_0$, where $P_h$ denotes the level of significance P on the disjunct
$h$. This account of Fisher’s logic follows Seidenfeld [1979, pp. 95-6], who
explained that to admit inexact tests Fisher had only to introduce the axiom
that a disjunction is discredited as much exactly as the least-discredited
disjunct. Fisher implied this axiom, but his own account is not so elegant.
Rather, he began again with his ‘either-or’ disjunction, with polemics:

If we are speaking of a composite family of hypotheses the position might be: either 
no one of these hypotheses is true, or an event has occurred the probability of which 
is less than or equal to P for any hypothesis of the family. If the matter had not 
been confused by half-understood slogans it would be universally accepted that the
level of significance of any test is set by the greatest frequency, among the family
of hypothesis under consideration, with which the criterion is surpassed. [1960, p. 8]
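
The axiom for inexact tests can be illustrated numerically. The following is a minimal sketch, not a computation taken from Fisher or Seidenfeld. It assumes a hypothetical composite null hypothesis made up of three simple binomial hypotheses (success probabilities 0.10, 0.20 and 0.25), a one-sided test on the number of successes, and an invented sample of 9 successes in 20 trials. The level of significance on the disjunction is taken to be the highest level over its disjuncts.

```python
from math import comb

def binomial_upper_tail(n, k, theta):
    """P(K >= k) for K ~ Binomial(n, theta): the level of significance on one disjunct."""
    return sum(comb(n, j) * theta**j * (1 - theta)**(n - j) for j in range(k, n + 1))

def composite_level(n, k, thetas):
    """Level of significance on the disjunction: the highest level over its disjuncts."""
    levels = {theta: binomial_upper_tail(n, k, theta) for theta in thetas}
    least_discredited = max(levels, key=levels.get)
    return levels[least_discredited], least_discredited, levels

# Hypothetical composite null: theta is 0.10 or 0.20 or 0.25 (illustrative values only).
disjuncts = [0.10, 0.20, 0.25]
P, h_max, per_disjunct = composite_level(n=20, k=9, thetas=disjuncts)

for theta, level in per_disjunct.items():
    print(f"disjunct theta={theta}: level of significance {level:.6f}")
print(f"composite null: level {P:.6f}, set by the disjunct theta={h_max}")
```

In this sketch the disjunct nearest to the data (here the largest success probability) is the least discredited, and it alone fixes the level reported for the disjunction.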


(v) Fisher on Support 


Fisher stated that tests of significance do not have any logical facility to support 
the null hypothesis. This is the ‘official’ position in the modern school 
loyal to Fisher.¹ Kalbfleisch and Sprott provide a clear but typical
statement:


1 There is an argument fairly common in Fisherian discussion which is said to demonstrate 
logically that a high level of significance can not be construed as support for the null 
hypothesis. The idea is that an outcome which has a high level of significance on the 
null hypothesis has similarly high levels of significance on alternatives ‘around’ the null 
hypothesis. Moreover, there are always alternative hypotheses which generate (much) higher 
levels of significance (given the sample X) than the level of significance P observed. 


Tests of Significance Following R. A. Fisher 487 


... a large significance level indicates that, with respect to the particular test used, 
the data provide no evidence that this hypothesis is false. This should not be 
interpreted as evidence in support of the hypothesis, but merely as a lack of evidence
against it. [1976, p. 265]


It would appear that the Fisherian position on support for the null hypoth- 
esis is well settled, yet at times Fisher himself seemed uncertain. Consider, 
for example, his remarks on the notion in Neyman-Pearson theory that a 
statistical test can lead us to ‘accept’ the null hypothesis: 


[The] point that the $\chi^2$ test, like the other tests of significance, is cogent for the
rejection of hypotheses, but, in the opposite case, by no means cogent for their 
acceptance, deserves to be widely appreciated. For the logical fallacy of believing 
that a hypothesis has been proved to be true, merely because it is not contradicted 
by the available facts, has no more right to insinuate itself in statistical than in other 
kinds of scientific reasoning. Yet it does so only too frequently. . . . It would, 
therefore, add greatly to the clarity with which the tests of significance are regarded 
if it were generally understood that tests of significance, when used accurately, are 
capable of rejecting or invalidating hypotheses, in so far as these are contradicted 
by the data; but that they are never capable of establishing them as certainly true.
[1935c, p. 474]


In these remarks, Fisher stated without qualification that a test of significance
cannot prove the null hypothesis; however, he chose not to
comment, as if lacking conviction, on whether a test might support the null
hypothesis to a lesser degree. Moreover, in later work, he suggested
explicitly, if somewhat reluctantly, that the result in a test of significance 
might confirm the null hypothesis, at least when considered along with other 
tests and evidence. For example: 


... if the observations are such that with reasonable probability they might have
arisen on the hypothesis under test, this hypothesis, though not proved, has at least
so far been confirmed, and, pending further and more stringent observations, may
be accepted. [1951, pp. 36-7]¹


... it is a fallacy, so well known as to be a standard example, to conclude from a
test of significance that the null hypothesis is thereby established; at most it may
be said to be confirmed or strengthened. [1955, p. 73]


A test of significance contains no criterion for ‘accepting’ a hypothesis. According 
to circumstances it may or may not influence its acceptability. [1956, p. 45] 


Here Fisher suggested that the result in a test of significance might support 
the null hypothesis, although he did not explain how. Meanwhile, he 
retained the notion that there are just two classes of results in a test of 
significance, results which entail ‘significant’ evidence against the null 
hypothesis, and results which do not, i.e. there is not (explicitly) a class of 


1 Fisher may have been influenced in this passage by Berkson. In regard to levels of significance
greater than about 0.3, Berkson [1942, p. 330] wrote: ‘Since by definition such P’s will occur
frequently in the case in which the null hypothesis is true, the finding of one is to be taken
as prima facie evidence in favor of the null hypothesis’. There is some resemblance between
Fisher’s passage and this passage from Berkson.




results which support the null hypothesis. Hence, it seems that Fisher’s 
position on support for the null hypothesis is somewhat confused, or at 
least obscure. 


(vi) The Logic for Repetition 


The object in a test of significance following Fisher is to discredit the 
null hypothesis. Fisher explained that a test of significance which yields a 
‘significant’ level of significance, i.e. a level of significance less than (say)
5%, is a promising start, but not enough to discredit the null hypothesis
convincingly. After all, the result in a test of significance might be due to
sampling error (aberration), or bias in the experiment, or an error in the
underlying model. Thus, the result in any single test of significance is
strictly provisional, and subject always to additional evidence. This was
clearly Fisher’s position. For example: 


... we may be able validly to apply a test of significance to discredit a hypothesis
the expectations from which are widely at variance with ascertained fact. If we use 
the term rejection for our attitude to such a hypothesis, it should be clearly under- 
stood that no irreversible decision has been taken; that, as rational beings, we are 
prepared to be convinced by future evidence that appearances were deceptive, and 
that in fact a very remarkable and exceptional coincidence had taken place. [1956,
p. 37]


Fisher maintained that the null hypothesis is discredited convincingly only 
with repeated tests of significance which yield repeatedly significant levels 
of significance: 


In relation to a test of significance, we may say that a phenomenon is experimentally 
demonstrable when we know how to conduct an experiment which will rarely fail 
to give us a statistically significant result. [1935a, p. 14] 


Fisher’s logic for repetition is that if the null hypothesis is true, then the
probability$_1$ (given by the probability$_2$) that a series of (independent) tests
will yield repeatedly significant levels of significance is extremely small,
indeed negligible. Hence, on the disjunction that either an extremely
improbable$_1$ series of events has occurred or the null hypothesis is false,
the null hypothesis is discredited. Moreover, given sufficient repetition, the 
null hypothesis is discredited convincingly. Fisher wrote: 


... no one doubts, in practice, that the probability of being led to an erroneous
conclusion by the chances of sampling only, can, by repetition . . . of the sample, 
be made so small that the reality of the difference must be regarded as convincingly 
demonstrated. [1936, p. 58] 
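
The arithmetic behind the logic for repetition is simple enough to sketch. The example below is illustrative only and rests on two stated assumptions: the tests are independent, and under a true null hypothesis each test yields a significant result with probability equal to the critical level (5% here, a hypothetical choice). The probability that every one of m tests is significant is then the critical level raised to the power m.

```python
def prob_all_significant(alpha, m):
    """Probability that m independent tests are all significant at level alpha
    when the null hypothesis is true (assumes each rejects with probability alpha)."""
    return alpha ** m

# Illustrative run: how quickly repeated significance becomes improbable under the null.
for m in (1, 2, 3, 5):
    print(f"{m} test(s): {prob_all_significant(0.05, m):.3g}")
```

With three repetitions the probability is already of the order of one in ten thousand, which is the sense in which repetition can make the chance of an erroneous conclusion negligible.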


(vii) The Critical Level $\alpha$


Fisher interpreted the critical level of significance $\alpha$ as a relative frequency
(probability$_2$). He explained that if we decide to ‘reject’ the null hypothesis




whenever the level of significance P is equal to or better than (for example) 
$\alpha$ = 1%, then we will be mistaken in the long run in not more than 1% of
cases: 


A man who ‘rejects’ a hypothesis provisionally, as a matter of habitual practice, 
when the significance is at the 1% level or higher, will certainly be mistaken in not 
more than 1% of such decisions. For when the hypothesis is correct he will be 
mistaken in just 1% of these cases, and when it is incorrect he will never be mistaken 
in rejection. [1956, pp. 44-5] 


Fisher used the critical level of significance $\alpha$ as a benchmark on which to
decide whether the best bet (literally) is to abandon efforts to exclude the
null hypothesis, or to proceed with additional tests. His position is that we
should make the critical level of significance very low, for example $\alpha$ = 5%,
in order to limit the relative frequency with which we proceed with null 
hypotheses which are in fact true: 


It is usual and convenient for experimenters to take 5 per cent. as a standard level 
of significance, in the sense that they are prepared to ignore all results which fail to 
reach this standard, and, by this means, to eliminate from further discussion the 
greater part of the fluctuations which chance causes have introduced into their 
experimental results. [1935a, p. 13] 


The value [of the statistic x] for which P = .05, or 1 in 20, is 1.96 or nearly 2; it is 
convenient to take this point as a limit in judging whether a deviation is to be 
considered significant or not... . Using this criterion we should be led to follow 
up a false indication only once in 22 trials .. . [1925, p. 44] 


Personally, the writer prefers to set a low standard of significance at the 5 per cent. 
point, and ignore entirely all results which fail to reach this level. [1926, p. 504] 
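
The figures in the 1925 passage can be checked against the normal distribution. The sketch below is an illustration added here, assuming a two-sided test on a standard normal deviate. It suggests that the deviation 1.96 corresponds to P = .05 (1 in 20), while the figure of ‘once in 22 trials’ matches the rounded limit of 2.

```python
from math import erfc, sqrt

def two_sided_normal_tail(z):
    """P(|Z| >= z) for a standard normal deviate Z."""
    return erfc(z / sqrt(2))

p_at_1_96 = two_sided_normal_tail(1.96)   # very close to 0.05
p_at_2 = two_sided_normal_tail(2.0)       # a little under 0.05
trials_per_false_lead = 1 / p_at_2        # close to 22

print(f"P(|Z| >= 1.96) = {p_at_1_96:.5f}")
print(f"P(|Z| >= 2)    = {p_at_2:.5f}, i.e. about once in {trials_per_false_lead:.1f} trials")
```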


Fisher used the critical level of significance $\alpha$ as a benchmark for decision,
but certainly not for inference. He did not mean to give the impression that
the critical level $\alpha$ partitions the sample space between samples which entail
evidence against the null hypothesis, and samples which do not. It was
simply his expedient, a matter of policy, to abandon the null hypothesis if
the level of significance P exceeds the critical level $\alpha$. This was a rule for
decision, not for inference. For inference, levels of significance around $\alpha$
represent essentially the same result:


In practice, a significance level of 5% or less is usually considered necessary before 
one can claim to have evidence against the hypothesis. Of course, this convention 
is arbitrary, and should not be taken too seriously. The interpretation of 4.9% and 
5.1% significance levels will be much the same, even though these values lie on
opposite sides of the magic 5% level. [Kalbfleisch (1979) p. 136]. 


Values of P should be quoted approximately and rigid borderlines avoided. Thus 
in terms of the interpretation of data it would be absurd to draw a rigid borderline 
between P = 0.051, not significant at the 0.05 level, and P = 0.049, significant at 
the 0.05 level, even though in decision-making contexts with strictly limited data 
such borderlines do have to be adopted. [Cox (1982) p. 328] 


(viii) Fisher's Dualism


Fisher wrote at length on the interpretation of levels of significance, but
not at all clearly. Indeed, it has been suggested that Fisher’s interpretation
is confused or at least confusing. Consider, for example, the passages from
Kempthorne (a statistician) and Rosenkrantz (a philosopher) quoted below: 


Part of the obscurity of Fisher with regard to tests of significance arises in connection 
with the meaning of a significance level, which is in some cases a frequency in a 
population of repetitions, but is not in other cases. [Kempthorne (1972) p. 179] 


One can force the problem into a decision theoretic mold, requiring that the 
experiment issue in a decision to accept or a decision to reject the null hypothesis, 
a rule of rejection being laid down in advance. Such a rule . . . is chosen on the 
basis of the error probabilities to which it leads ... One could adopt a very different 
attitude, however, and regard the probability of equalling or exceeding the [critical 
value of the test statistic] as a measure of the evidence against the null hypothesis. 
Fisher’s own position on the matter is ambivalent to say the least [Rosenkrantz
(1977) pp. 178-9].


There is a fundamental dualism in Fisher’s theory, which gives the 
impression of confusion or ambivalence. Fisher was concerned essentially 
with inference, yet he was concerned as well with decision. Specifically, he 
was concerned with the decision of whether to abandon the null hypothesis 
or to proceed with additional tests. He was happy to admit relative frequency
(probability$_2$) as the basis for decision,¹ at least for repetitive decision, but
not as the basis for inference, not without knowledge of the reference set. 
Indeed, he said as much, in typically cryptic style: 

The logical consequences of a statement of Mathematical Probability are clear and
well-known. They allow the calculation of long-run policies in laying bets, in the
hypothetical and restricted field of games of chance, played fairly with perfect
apparatus; but these logical consequences may . . . be in themselves of little importance
to the bearing of observable facts on the acceptability of possible hypotheses.
[1956, p. 47]

... the mathematical concept of probability is, in cases in which fiducial probability
[i.e. probability$_1$, measured as probability$_2$] is not available, inadequate to express
our mental confidence or diffidence in making such [inductive] inferences. [1925,
p. 10]


4 CONDITIONALITY 


Fisher emphasized that the probability p(E) of the event E measured over
the reference set R is a legitimate probability$_1$ in the single case X (where
$X \in R$) if and only if the reference set R is free from any recognizable relevant
(with respect to E-ness) subset {. .X. .}. The principle that the reference
set R must be free from any recognizable relevant subset {. .X. .} has


1 Note that Fisher praised the use of frequency-based tests (he called these ‘acceptance
procedures’) in the recurrent decisions in quality control, such as the decision to accept or
reject a consignment of components, e.g. [1956, p. 80].




of course the corollary that if R includes a recognizable relevant subset 
{. .X. .}, then that subset is the proper reference set on which to measure 
the probability p(E) in the single case X. Fisher wrote:


Manifestly, if the subject [single case] did belong to such a recognizable subset 
[a recognizable relevant subset] the latter would replace the original set, as the 
appropriate basis for the probability statement. [1962, p. 17] 


Thus, if the reference set R includes a recognizable relevant subset {. .X. .},
say $R_1$, then R must give way to $R_1$. Or, in terms of “conditionality”, R
must be conditioned on the property which defines that subset $R_1$, i.e.
the property which is peculiar to the particular cases which comprise $R_1$.
Conditionality is important if there is a statistic which is ‘ancillary’. The
statistic A is ‘ancillary’ with respect to the parameter $\theta$ if the distribution
$p(A \mid \theta)$ is independent of the parameter $\theta$ [Basu (1959) p. 247]. If the statistic
A is ancillary, then the value A(X) defines a recognizable relevant (with
respect to P) subset {. .X. .} within the sample space S, i.e. within the set
of all the samples which would arise given repeated sampling. Thus, in
principle, the reference set R must be conditioned on the value A(X).
Fisher pursued this principle bitterly. He condemned Neyman and Pearson 
on the basis that they promoted tests with no concern for ancillary statistics. 
Consider, for example, his sentiments concerning 2 x 2 contingency tests, 
in which the margins are ancillary: 


A whole series of erroneous tests of significance were incorporated into statistical 
teaching [by Neyman and Pearson], and . . . ensure that many young men now 
entering employment in research, or industry, or administration, have been partly 
incapacitated by the crooked reasoning with which they have been indoctrinated. 
Familiar examples include the test of proportionality in a two by two table, perhaps 
the most frequently used of all tests of significance, in which the recognizable sub- 
set of possibilities having the same marginal totals as the sample observed, has been 
more than once over-looked ... It is . . . the fact that the marginal totals in 
the sample are known or recognizable, which defines the sub-set of possibilities 
appropriate and available for the test of significance. [1960, pp. 6-7]
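
Conditioning on the marginal totals can be made concrete. The sketch below is a minimal illustration of a conditional test for a 2 × 2 table, in which the reference set consists only of tables with the same margins as the table observed, so that the count in the top-left cell follows a hypergeometric distribution. The table used (3, 1, 1, 3) is invented for the example, and the one-sided direction of the test is an assumption.

```python
from math import comb

def conditional_p_value(a, b, c, d):
    """One-sided level of significance for the 2 x 2 table [[a, b], [c, d]],
    computed over the reference set of tables sharing its marginal totals."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    x_max = min(row1, col1)
    # Sum over tables at least as extreme as the one observed (top-left cell >= a).
    return sum(prob(x) for x in range(a, x_max + 1))

# Hypothetical table with all four marginal totals equal to 4.
p = conditional_p_value(3, 1, 1, 3)
print(f"conditional level of significance: {p:.4f}")
```

Tables with other marginal totals never enter the calculation, which is the sense in which the margins define the subset of possibilities used for the test.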


Fisher’s principle of conditionality is not always successful. Basu [1964, 
pp. 11—13] has identified tests in which there are two or more ancillary 
statistics which together are not ancillary. In tests such as these, the ref- 
erence set might be conditioned on just one ancillary statistic, perhaps 
arbitrarily, or not conditioned at all. Either way, recognizable relevant 
subsets remain. This is a fundamental shortcoming in Fisher’s program for 
tests of significance. 

Fisher explained in his examples that if the statistic A is ancillary with
respect to the parameter $\theta$, then the value A(X) measures the precision of
the information in the sample X concerning the true value of the parameter
$\theta$, e.g. [1935b, p. 48]; [1955, p. 72]. He said that if the reference set R is
conditioned on the value A(X), then R will include samples all of the same 
precision. It has been suggested that these remarks constitute Fisher’s 
rationale for conditioning. For example: 




... the motivation for conditioning . . . was stated by Fisher [1935b, p. 48] as
follows: “Ancillary statistics are only useful when different samples of the same size 
can supply different amounts of information, and serve to distinguish those which 
supply more from those which supply less.” [Lehmann (1981) p. 339] 


In problems of inference one must take into account the informativeness of the 
outcome actually obtained. In a test of significance, the data should be compared 
with, and hence the reference set should consist of, outcomes having approximately 
the same informativeness or precision as the one observed. [Kalbfleisch and Sprott 
(1976) p. 267] 

The observed value of the ancillary statistic indicates the informativeness of the 
data actually obtained. It is therefore appropriate to base inferences about $\theta$ on the
conditional distribution of [the test statistic] T given the observed value of A. 
[Kalbfleisch (1979) p. 183] 


Statements such as these obscure Fisher’s logic. Fisher insisted that the 
level of significance P on the sample X is a legitimate probability$_1$ in the
single case if and only if the reference set R is free from any recognizable
relevant subset {. .X. .}. He conditioned the reference set R on the value 
A(X) of the ancillary statistic A so that this might be the case. 


5 ON ‘REPEATED SAMPLING FROM THE SAME POPULATION’ 


Neyman and Pearson introduced the notion that the critical level of significance
$\alpha$ represents the relative frequency $f(P \le \alpha \mid h_0)$ in ‘repeated sampling
from the same population’. Fisher rejected this notion, with two distinct 
arguments. Let me consider these in turn: 


(i) Fisher argued that if the reference set R is conditioned on the value 
A(X) of the ancillary statistic A, then the population (reference set) R is not 
open to repeated sampling, because the value of the conditioning variable A 
cannot be fixed from one sample to the next. Consider, for example, a t-test
on $h_0\!: \beta = \beta_0$, given the estimate b of the coefficient $\beta$, in a simple
regression of x on y. The null distribution of the test statistic
$t = (b - \beta_0)\sqrt{A}/s$, with n−2 degrees of freedom, depends on the ancillary statistic
$A = \sum (x - \bar{x})^2$. Fisher wrote:


I do not believe that anyone doubts the validity of this simple test. It does, however, 
violate the rule of determining levels of significance by frequencies of occurrence 
of the proposed events in repeated samples from the same population. For if a 
succession of sets of N pairs of observations (x, y) were taken from the same 
population, the value of A would not be the same for each set. [1955, p. 71] 


If we must think in terms of random sampling, it is only that selection of random 
samples which agree exactly with our own in respect to the value of $A = \sum (x - \bar{x})^2$,
that is relevant... . Such a selection might be quite inaccessible . . . [1956, p. 89] 


This argument is straightforward. The population (reference set) sampled 
depends on the value A(X), and A(X) changes from sample to sample. 




Hence, the notion of repeated sampling from the very same population is 
strictly not applicable. 
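
Fisher's point that the conditioning statistic changes from sample to sample can be sketched by simulation. The example below is illustrative only. It assumes, hypothetically, that pairs (x, y) are drawn with x standard normal and y linear in x with normal errors, takes x as the regressor (consistent with A being a sum of squares in x), and uses invented values for the intercept, slope and sample size. For each sample it computes $A = \sum (x - \bar{x})^2$ and $t = (b - \beta_0)\sqrt{A}/s$.

```python
import random

def slope_test(xs, ys, beta0):
    """Return (A, t) for a least-squares fit of y on x, testing slope = beta0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    A = sum((x - x_bar) ** 2 for x in xs)                      # ancillary statistic
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / A
    a = y_bar - b * x_bar
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))    # residual sum of squares
    s = (rss / (n - 2)) ** 0.5                                 # residual standard error
    t = (b - beta0) * A ** 0.5 / s
    return A, t

rng = random.Random(0)          # fixed seed so the sketch is reproducible
results = []
for _ in range(5):
    xs = [rng.gauss(0, 1) for _ in range(10)]
    ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]         # hypothetical population
    results.append(slope_test(xs, ys, beta0=0.5))

ancillary_values = [A for A, _ in results]
for A, t in results:
    print(f"A = {A:.3f}, t = {t:.3f}")
```

Each repetition draws from the same population, yet each yields a different value of A, so no two samples share the same conditional reference set.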


(ii) Fisher argued that the level of significance P given the composite null
hypothesis $h_0$ = ($h_1$ or $h_2$ or . . .) is the highest level of significance $P_h$ over
the class of disjuncts $h \subset h_0$. Thus, the relative frequency $f(P \le \alpha \mid h_0)$ will
be equal to $\alpha$ only if the disjunct $h$ which generates the highest level of
significance $P_h$ is the true disjunct. If some other disjunct $h$ is true, then
the relative frequency $f(P \le \alpha \mid h_0)$ will be less than $\alpha$. Indeed, if there is any
great ‘distance’ between the disjunct $h$ which generates the highest level of
significance $P_h$ and the disjunct which is true, then the relative frequency
with which the level of significance P is less than or equal to the critical
level $\alpha$ will be greatly less than $\alpha$. Hence, Fisher concluded that the evidence
against the null hypothesis (which is measured by the level of significance
P) cannot be equated simply with the relative frequency with which
evidence of that force (P-value) is attained:


Composite hypotheses in general, however, contain another reason for ignoring 
the assumption that the frequency of rejection should be equated to the level of 
significance; which reason flows from the very fact that they are composite, i.e. that
two or more distinct possibilities are to be rejected, each on sufficiently strong 
evidence. It may be that samples of the kinds available do not so easily dismiss the 
whole range of hypotheses to be tested even at a moderate level of significance. ... 
Obvious as it might seem, it is evidently necessary to point out that it is no remedy 
to construct a test of significance with a firm intention that the hypothesis shall be 
rejected when true in a fixed proportion of trials. For this may well be math- 
ematically impossible, for the whole range of cases; and consequently, what is of 
much greater importance, a test which is made to conform in one case may go 
widely astray in others. .. . In fact, as a matter of principle, the infrequency with 
which, in particular circumstances, decisive evidence is obtained, should not be 
confused with the force, or cogency, of such evidence. It is the highest frequency, 
within the range of the class of hypotheses tested, that is logically relevant. [1956,
pp. 93-6]
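
The gap between the level of significance and the frequency of rejection can be computed exactly in a small case. The sketch below is illustrative, not a reconstruction of any example in Fisher. It assumes a hypothetical composite null with three binomial disjuncts (success probabilities 0.05, 0.15 and 0.25), 20 trials, a one-sided test, and a critical level of 5%. The level of significance of each outcome is the highest level over the disjuncts, and the frequency of rejection is then evaluated under each disjunct in turn.

```python
from math import comb

def upper_tail(n, k, theta):
    """P(K >= k) for K ~ Binomial(n, theta)."""
    return sum(comb(n, j) * theta**j * (1 - theta)**(n - j) for j in range(k, n + 1))

def rejection_frequency(n, thetas, alpha, true_theta):
    """Relative frequency with which the composite null is rejected at level alpha
    when the disjunct true_theta is in fact true."""
    # An outcome k is rejected when its highest level over the disjuncts is <= alpha.
    rejected = [k for k in range(n + 1)
                if max(upper_tail(n, k, th) for th in thetas) <= alpha]
    return sum(comb(n, k) * true_theta**k * (1 - true_theta)**(n - k) for k in rejected)

disjuncts = [0.05, 0.15, 0.25]   # hypothetical disjuncts of the composite null
freqs = {th: rejection_frequency(20, disjuncts, 0.05, th) for th in disjuncts}
for th, f in freqs.items():
    print(f"true disjunct theta={th}: frequency of rejection {f:.2e}")
```

Because the binomial statistic is discrete, the frequency under the least-discredited disjunct is at most 5% rather than exactly 5%. Under the disjuncts further from it the frequency is smaller by orders of magnitude, although the level of significance attached to any given outcome is unchanged.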


This argument seems clear enough; however, there is some dissent or confusion.
Indeed, Kempthorne [1979, pp. 205-7] rejects the statement from
Fisher quoted above out of hand:


I wish to suggest that this is a totally confused and confusing statement. How are 
we to disentangle the idea of ‘level of significance’ from frequency in repetitions? 
One might suppose that Fisher gives us real guidance on this, but, in my opinion, 
he does not. . . . Fisher, in spite of all his polemics against the use of the idea of 
repetition of sampling, uses the idea totally, completely and intimately. ... We can 
certainly examine the frequency with which a sample point falls in a ‘significance’ 
[critical] region. But Fisher does not tell us how we quantify ‘‘the force, or cogency, 
of such evidence.” Without this, I think we have no recourse but to ignore his 
writings (in this area, of course). 


Kempthorne is frustrated by Fisher’s writing, which is obscure and polemical;
however, he does not do Fisher justice. Fisher illustrated his argument
with a typically enchanting example. The null hypothesis in this
example is the hypothesis that ‘the proportion of the two red suits [in a set of
cards] do not both exceed 25%’. This hypothesis reduces to the disjunction
$h_0$ = ($h_1$ or $h_2$), where $h_1$ is the hypothesis that the proportion of hearts is
less than or equal to 25%, and $h_2$ is the hypothesis that the proportion of
diamonds is less than or equal to 25%. Fisher maintained that we can reject
the disjunction $h_0$ = ($h_1$ or $h_2$) at the highest level of significance $P_h$ over the
class of disjuncts $h \subset h_0$. Thus, we can reject $h_0$ = ($h_1$ or $h_2$) at the $\alpha$ level of
significance if we can reject both the disjunct $h_1$ and the disjunct $h_2$ at the $\alpha$
level, i.e. the critical region of size $\alpha$ for the null hypothesis $h_0$ = ($h_1$ or $h_2$)
is the intersection $S_1 \cap S_2$, where $S_1$ is the critical region of size $\alpha$ for the
hypothesis $h_1$, and $S_2$ is the critical region of size $\alpha$ for the hypothesis $h_2$.
Kempthorne identified this region correctly:


The outcome, according to the Fisher example, is that using ‘a reasonable level of 
significance’ for each component [$h$] of our composite null hypothesis, we shall
decide that there is reasonable strength of evidence against the composite null
hypothesis if the sample point lies in $S_1 \cap S_2$, where $S_1$ is determined from $h_{01}$ [my
$h_1$] and $S_2$ is determined from $h_{02}$ [my $h_2$]. [1979, p. 207]


Kempthorne identified the critical region in Fisher’s test correctly, and yet
he interpreted Fisher’s null hypothesis as a conjunction, viz. ‘$h_{01}$ and $h_{02}$’
($h_1$ and $h_2$). This is clearly a mistake, for if the null hypothesis in Fisher’s
example is the conjunction ($h_1$ and $h_2$), then the logical critical region is
not the intersection $S_1 \cap S_2$, but the union $S_1 \cup S_2$. Thus, if we accept
Kempthorne’s analysis, there is an elementary error of logic in Fisher’s
example, an error which Kempthorne ignores. There is of course no such
error; the null hypothesis in Fisher’s example is not a conjunction, but a
disjunction, viz. ($h_1$ or $h_2$). Now consider the test, introduced by
Kempthorne, in which the disjuncts $h$ exhaust the infinite universal set U.
The critical region in a test such as this is the intersection $S_1 \cap S_2 \cap \ldots$,
which is clearly null (empty). Thus, on Fisher’s logic, it is not possible to
reject a composite hypothesis in which the disjuncts exhaust the universal
set U. Kempthorne finds this absurd. But again he is wrong. A composite
hypothesis in which the ‘component’ hypotheses exhaust the universal set
U cannot be anything but true. This sort of hypothesis reduces to the
disjunction ($h_1$ or $h_2$ or . . .) where $\{h\} = U$, which is the classical tautology.
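
The difference between the intersection and the union of the two critical regions can be checked by enumeration. The sketch below is an illustration under stated assumptions, not Fisher's or Kempthorne's own calculation: 20 cards are drawn with replacement, each critical region is a one-sided region on the count of the relevant suit, the critical level is 5%, and the proportions used (hearts 25%, diamonds 50%) are invented so that $h_1$, and therefore the disjunction, is true.

```python
from math import comb, factorial

def upper_tail(n, k, p):
    """P(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def critical_count(n, p, alpha):
    """Smallest count c such that P(K >= c) <= alpha under proportion p."""
    for c in range(n + 2):
        if upper_tail(n, c, p) <= alpha:
            return c

def trinomial(n, a, b, p_a, p_b):
    """Probability of a hearts, b diamonds and n - a - b other cards in n draws."""
    rest = n - a - b
    coeff = factorial(n) // (factorial(a) * factorial(b) * factorial(rest))
    return coeff * p_a**a * p_b**b * (1 - p_a - p_b)**rest

n, alpha = 20, 0.05
c = critical_count(n, 0.25, alpha)   # boundary of each disjunct is a proportion of 25%

outcomes = {(a, b) for a in range(n + 1) for b in range(n + 1 - a)}
S1 = {(a, b) for (a, b) in outcomes if a >= c}   # discredits h1 (hearts <= 25%)
S2 = {(a, b) for (a, b) in outcomes if b >= c}   # discredits h2 (diamonds <= 25%)

def region_probability(region, p_hearts, p_diamonds):
    return sum(trinomial(n, a, b, p_hearts, p_diamonds) for (a, b) in region)

# Hypothetical truth: hearts at 25% (so h1, and hence the disjunction, is true).
size_intersection = region_probability(S1 & S2, 0.25, 0.50)
size_union = region_probability(S1 | S2, 0.25, 0.50)
print(f"critical count: {c}")
print(f"frequency of rejection with the intersection: {size_intersection:.4f}")
print(f"frequency of rejection with the union:        {size_union:.4f}")
```

With the intersection as critical region, the true disjunction is rejected no more often than the critical level allows. With the union it would be rejected far more often, which is why the union belongs to the conjunction and not to the disjunction.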


6 SEIDENFELD’S ARGUMENT 


Seidenfeld is, I think, the most helpful single source on Fisher.¹ I have
gained much from his work, especially from his superb book [1979] sub- 
titled Learning from R. A. Fisher. I agree with Seidenfeld’s account of tests 
of significance following Fisher, but for one important matter. Seidenfeld 
claims that Fisher ruled out any ‘repeated sampling’ interpretation of levels 


1 Henry Kyburg is also a great help. Kyburg’s theory is very much in the spirit of Fisher’s
theory, but much more developed, e.g. Kyburg [1974].




of significance P if the null hypothesis is composite, or if the reference set 
R is conditioned on an ancillary statistic A. He wrote:


Particular attention is paid to Fisher’s remarks with respect to tests of homogeneity 
for the 2 x 2 contingency table. The emphasis is on avoiding the repeated sampling 
interpretation of the significance test. It is shown that generally no such interpret- 
ation is available when the test is of a composite hypothesis, or when the test 
employs an ancillary statistic (for conditioning) . . . [1979, p. 70] 


Fisherian significance tests, in general, do not allow interpreting the P-level as 
a relative frequency in a (hypothetical) sequence of repeated trials of the same 
experiment. [1984] 


I think that Seidenfeld is mistaken. Let me begin with tests in which the 
null hypothesis is composite. Fisher explained that if the null hypothesis 
is composite, then we can not equate the level of significance P exactly with 
the relative frequency in ‘repeated sampling’ of samples with that P-level, 
not unless the test is exact. However, we can say that the level of significance 
P is equal to or greater than the relative frequency of samples with that 
P-level. Hence, it seems wrong to suggest that no ‘repeated sampling’ 
interpretation is available. Now consider tests in which the reference set R
is conditioned on the value A(X) of an ancillary statistic A. Fisher cited 
tests such as these to disparage the Neyman-Pearson shibboleth ‘repeated 
sampling from the same population’. He argued that the reference set 
(population) R is not open to repeated sampling because R changes from 
sample to sample depending on the value A(X). But the ‘repeated sampling’ 
interpretation of the level of significance P requires merely that the ref- 
erence set R is the same set stochastically (with respect to P) from one 
sample to the next, i.e. the level of significance P must have the same 
distribution over the reference set R no matter what the sample. Now we 
know that this is so. By definition, the level of significance P is distributed 
over the reference set R uniformly between 0 and 1, whatever the sample
[Cox and Hinkley (1974) p. 66]. Therefore, notwithstanding Fisher’s
strictures, a ‘repeated sampling’ interpretation is available, despite the fact that the
value A(X) cannot be fixed. Indeed, Fisher relied on a ‘repeated sampling’
interpretation himself, not for inference, but for the decision to abandon
the null hypothesis or to proceed with further tests. He never suggested
that there is (or must be) another logic for this decision if the reference
set is conditioned on an ancillary statistic.
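
The claim that the level of significance has the same distribution whatever the value of the ancillary statistic can be sketched by simulation. The example below is an illustration under assumptions that are not in the text: a hypothetical experiment in which the sample size (5 or 50) is chosen by the toss of a fair coin, so that the sample size is ancillary, followed by a two-sided test on a normal mean with known unit variance, with the null hypothesis true. The level of significance is computed conditionally on the sample size actually obtained.

```python
import random
from math import erfc, sqrt

def p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided level of significance for a normal mean with known sigma,
    conditional on the size of the sample actually drawn."""
    n = len(sample)
    z = (sum(sample) / n - mu0) * sqrt(n) / sigma
    return erfc(abs(z) / sqrt(2))

rng = random.Random(1)            # fixed seed so the sketch is reproducible
alpha = 0.05
trials = 20000
counts = {5: [0, 0], 50: [0, 0]}  # sample size -> [rejections, trials]

for _ in range(trials):
    n = rng.choice([5, 50])       # ancillary: its distribution does not depend on the mean
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # null hypothesis is true
    counts[n][1] += 1
    if p_value(sample) <= alpha:
        counts[n][0] += 1

freq_by_n = {n: r / t for n, (r, t) in counts.items()}
overall = sum(r for r, _ in counts.values()) / trials
print(f"frequency of P <= {alpha} by sample size: {freq_by_n}")
print(f"overall frequency: {overall:.4f}")
```

In this sketch the frequency with which P falls at or below the critical level is close to that level within each conditional reference set, and therefore also across the whole sequence of trials, even though the reference set differs from one sample to the next.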


7 SAMPLES ALIKE IN ALL ‘RELEVANT’ RESPECTS 


Fisher declared repeatedly that the samples which comprise the reference 
set R must resemble the sample X in all ‘relevant’ respects. For example: 


In tests of significance we define a class of events (samples) having a certain specified 
frequency (5 per cent, 1 per cent, etc., according to the level of significance chosen) 
among a population A [my R] of samples to which we regard the sample observed, 
to which the test of significance is to be applied, to belong. This population must 




resemble the sample observed in all [recognizable] relevant respects; indeed the 
argument is not impeded if it be taken to resemble the sample in all observable 


respects. [1945, p. 130] 

... in testing the significance with a unique sample, we should compare it only
with other possibilities [samples] in all [recognizable] relevant respects like that
observed ... [1955, p. 72]


The principle that the samples which comprise the reference set R must 
resemble the sample X in all recognizable ‘relevant’ respects is not at all well 
known, or understood. Consider, for example, Kempthorne’s conclusion 
concerning the first of the passages quoted above: 


The last sentence, particularly, leads me to the view that Fisher was talking on a 
plane barely understandable to the rest of humanity. [1966, p. 13] 


I suspect that Kempthorne is confused by Fisher’s word ‘relevant’ used to
describe respects of the sample X. A ‘respect’ r = r(X) of the sample X is
simply a property of that sample, such as that it has size n, or that it was drawn
just before the scientist had to catch the train. It is clear, I think, that the
respect r of the sample X is a ‘relevant’ respect in Fisher’s sense if r defines 
a relevant (peculiarly distributed) subset, i.e. if r is peculiar to samples 
which comprise a relevant subset. Thus, Fisher’s principle that the ref- 
erence set R must resemble the sample X in all ‘relevant’ respects is merely 
his principle that R must be free from any recognizable relevant subset. 


8 THE ‘RANDOM’ SAMPLE 


Fisher insisted in theory that the sample X in a test of significance must be 
a random sample in the ordinary frequentist sense. He explained that the 
mathematics in a test of significance presuppose that the sample X is drawn 
at random: 


The theory of estimation presupposes a process of random sampling. All our 
conclusions within that theory rest on this basis, without it our tests of significance 
would be worthless. In order to justify the conclusions of the theory of estimation, 
and the tests of significance as applied to counts or measurements arising in the real 
world, it is logically necessary that they too must be the results of a random process.
[1947, pp. 435-6]

Randomization is a device designed to ensure that the laws of chance [the probability
calculus] used in testing the significance shall be validly operative in the physical
conduct of the experiment. [1951, p. 54]


Fisher insisted in theory that X must be a random sample in the frequentist
sense; in practice, however, he was less convincing. He made little or
nothing in practice of the principle that the units (subjects) used in an 
experiment must be drawn at random from the population in question. It 
has been stressed in conventional textbooks that the conditions which 
operate in an experiment, including the units used in the experiment, must
be generated at random, but this reflects I think the influence of Neyman
more than Fisher. Fisher was not as rigorous as Neyman. He was often 
content to let fate decide the conditions in the experiment: 


In controlled experimentation it has been found not difficult to introduce explicit 
and objective randomization in such a way that the tests of significance are demon- 
strably correct. In other cases we must still act in the faith that nature has done the 
randomization for us, as for example, when we conduct a series of variety trials 
over a period of five years, and draw our conclusions as if the weather experienced 
in these years were really a random sample of five out of an infinite population of 
seasons determined by the climate prevailing at the site of experimentation.
[1947, p. 436]


Not content with the ordinary frequentist notion of the random sample, 
Fisher developed his own concept, which he employed jointly with the 
ordinary frequentist concept. His own concept is that the sample X is a
‘random’ sample from the reference set R (with respect to E-ness) if the
reference set R is free from any recognizable relevant (with respect to E-ness)
subset {..X..} (e.g. [1956, pp. 113-4]; [1958, pp. 263-4]). Consider,
for example, the result in a single throw of a die. Fisher suggested that the 
result in such a trial is a ‘random’ sample from the reference set of all 
possible throws if the reference set of all possible throws is free from any 
recognizable relevant subset: 


Before the limiting ratio of the whole set can be accepted as applicable to a particular 
throw, a second condition must be satisfied, namely that before the die is cast no 
such [relevant] subset can be recognized. . . . On this condition we may think of 
[the result in] a particular throw, or of a succession of throws, as a random sample
from the aggregate, which is in this sense subjectively homogeneous and without 
recognizable stratification. [1956, p. 35] 


Randomness on the frequentist account is an empirical state (of the sampling 
mechanism). Hence, it is impossible to be sure whether a sample is random 
on the frequentist account. But on Fisher’s account, randomness is an 
epistemic state, i.e. a state of knowledge. A sample is random on Fisher’s 
account depending simply on our knowledge of relevant subsets. Note of 
course that neither account is intended to embrace the other. 

Fisher appreciated that the sample X in a test of significance should
ideally be a random sample in the ordinary frequentist sense [cf. Kyburg
(1976) pp. 365-75], but ultimately he seemed more concerned that the
sample X is a random sample (from the reference set R) in his own
epistemological sense. His position is that if the sample X is not a random
sample in his own sense, i.e. if the reference set R is not free from any
recognizable relevant subset {..X..}, then the level of significance P is not
a legitimate probability in the single case.


Department of Accounting, 
University of Sydney 




REFERENCES 


BASU, D. [1959]: ‘The Family of Ancillary Statistics’, Sankhya, 21, pp. 247-56.

BASU, D. [1964]: ‘Recovery of Ancillary Information’, Sankhya, 26, pp. 3-16.

BERKSON, J. [1942]: ‘Tests of Significance Considered as Evidence’, Journal of the American
Statistical Association, 37, pp. 325-35.

BROWN, L. [1967]: ‘The Conditional Level of Student’s t Test’, Annals of Mathematical
Statistics, 38, pp. 1068-71.

BUEHLER, R. J. [1959]: ‘Some Validity Criteria for Statistical Inferences’, Annals of Math- 
ematical Statistics, 30, pp. 845-63. 

CARNAP, R. [1950]: Logical Foundations of Probability. Second Edition 1962. University of 
Chicago Press. 

COX, D. R. [1982]: ‘Statistical Significance Tests’, British Journal of Clinical Pharmacology,
14, pp. 325-31.

COX, D. R. and HINKLEY, D. V. [1974]: Theoretical Statistics. Chapman and Hall.

FISHER, R. A. [1925]: Statistical Methods for Research Workers. Fourteenth Edition 1970.
Oliver and Boyd.

FISHER, R. A. [1926]: ‘The Arrangement of Field Experiments’, Journal of the Ministry of
Agriculture Great Britain, 33, pp. 503-13.

FISHER, R. A. [1935a]: The Design of Experiments. Eighth Edition 1966. Oliver and Boyd.

FISHER, R. A. [1935b]: ‘The Logic of Inductive Inference’, Journal of the Royal Statistical
Society, 98, pp. 39-82.

FISHER, R. A. [1935c]: ‘Statistical Tests’, Nature, 136, p. 474.

FISHER, R. A. [1936]: ‘“The Co-efficient of Racial Likeness” and the Future of Craniometry’,
Journal of the Royal Anthropological Institute, 66, pp. 57-63.

FISHER, R. A. [1945]: ‘The Logical Inversion of the Notion of the Random Variable’, Sankhya,
7, pp. 129-32.

FISHER, R. A. [1947]: ‘Development of the Theory of Experimental Design’, Proceedings of 
the International Statistical Conferences, 3, pp. 434-9. 

FISHER, R. A. [1951]: ‘Statistics’ in Scientific Thought in the Twentieth Century (Heath, A., 
ed.) pp. 31-55. Watts. 

FISHER, R. A. [1955]: ‘Statistical Methods and Scientific Induction’, Journal of the Royal 
Statistical Society, B, 17, pp. 69-78. 

FISHER, R. A. [1956]: Statistical Methods and Scientific Inference. Third Edition 1973. 
Hafner. 

FISHER, R. A. [1958]: ‘The Nature of Probability’, Centennial Review, 2, pp. 261-74.

FISHER, R. A. [1959]: ‘Mathematical Probability in the Natural Sciences’, Technometrics, 1,
pp. 21-9.

FISHER, R. A. [1960]: ‘Scientific Thought and the Refinement of Human Reasoning’, Journal
of the Operations Research Society of Japan, 3, pp. 1-10.

FISHER, R. A. [1962]: ‘The Place of the Design of Experiments in the Logic of Scientific
Inference’, Colloques Internationaux du Centre de la Recherche Scientifique (Paris), 110,
pp. 13-9.

KALBFLEISCH, J. G. [1979]: Probability and Statistical Inference. Volume 2. Springer-Verlag. 

KALBFLEISCH, J. G. and SPROTT, D. A. [1976]: ‘On Tests of Significance’, in Foundations of
Probability Theory, Statistical Inference, and Statistical Theories of Science. Volume 2.
(Harper, W. L. and Hooker, C. A., eds.) pp. 259-72. D. Reidel.

KEMPTHORNE, O. [1966]: ‘Some Aspects of Experimental Inference’, Journal of the American
Statistical Association, 61, pp. 11-34.

KEMPTHORNE, O. [1972]: ‘Theories of Inference and Data Analysis’, in Statistical Papers in
Honor of George W. Snedecor (Bancroft, T. A., ed.) pp. 167-91. Iowa State University
Press.

KEMPTHORNE, O. [1979]: ‘In Dispraise of the Exact Test: Reactions’, Journal of Statistical
Planning and Inference, 3, pp. 199-213.

KEMPTHORNE, O. and DOERFLER, T. E. [1969]: ‘The Behaviour of Some Significance Tests
under Experimental Randomization’, Biometrika, 56, pp. 231-48.

KYBURG, H. E. [1974]: The Logical Foundations of Statistical Inference. D. Reidel.

KYBURG, H. E. [1976]: ‘Chance’, Journal of Philosophical Logic, 5, pp. 355-93.

LEHMANN, E. L. [1981]: ‘An Interpretation of Completeness and Basu’s Theorem’, Journal of
the American Statistical Association, 76, pp. 335-40.




LINDLEY, D. V. [1971]: Introduction to Probability and Statistics from a Bayesian Viewpoint. 
Part 2. Cambridge University Press. 

PIERCE, D. A. [1973]: ‘On Some Difficulties in a Frequency Theory of Inference’, The Annals 
of Statistics, 1, pp. 241-50. 

ROBINSON, G. K. [1976]: ‘Properties of Student’s t and of the Behrens-Fisher Solution to the 
Two Means Problem’, The Annals of Statistics, 4, pp. 963-71. 

ROSENKRANTZ, R. D. [1977]: Inference, Method and Decision: Toward a Bayesian Philosophy 
of Science. D. Reidel. 

SEIDENFELD, T. [1979]: Philosophical Problems of Statistical Inference: Learning from R. A.
Fisher. D. Reidel.

SEIDENFELD, T. [1984]: Private Communication. 


Brit. J. Phil. Sci. 38 (1987), 501-513 Printed in Great Britain


Rationality and Predictability in 
Economics 
by CRISTINA BICCHIERI 


1 Introduction
2 Epistemic Rationality and Predictability
3 Expectations as a Datum
4 Rational Expectations
5 Conclusions


1 INTRODUCTION


The recent literature on the philosophy of economics focuses on two dif- 
ferent, though related, problems. The first is whether economic theories are 
capable of generating interesting predictions, that is, testable results that 
would confirm (or falsify) their general statements. This is a particularly 
relevant problem in view of the aim of economic theories of generating 
policy prescriptions. The second problem has to do with the so-called 
‘realism’ of economic assumptions about individual behaviour. Here the 
epistemic status of these assumptions is called into question. 

Little attention, however, has been devoted to the question whether the
common economic assumptions about individual behaviour are enough to 
allow for prediction. Without questioning the ‘positive’ goals economics 
sets itself, or the ‘realism’ of assumptions made in economic theories, I will 
enquire into the conditions theories have to satisfy in order to be predictive. 
First, I will show that—in order to be able to predict—economists have to 
assume not only practical, but also epistemic rationality on the part of 
agents. Second, I will consider the different interpretations of epistemic 
rationality proposed by economists. Third, I will show that even the best 
theory thus far proposed, that of ‘rational expectations’, is inconsistent with 
the economist’s goal of offering policy prescriptions. Since economists want 
their theories to be predictive in order to be able to analyse the consequences 
of policy interventions, this inconsistency constitutes a serious problem. 


2 EPISTEMIC RATIONALITY AND PREDICTABILITY 


The common mode of explanation in economics is methodological indi- 
vidualism. Macroeconomic phenomena, for example, are explained as 
the collective result of individual behaviour, where the motivating forces 
are taken as exogenously given, together with endowments and technology. 


Received December 1986



In order to be able to arrive at generalisations, economists have to assume 
that individual behaviour follows some regular pattern, that all actions 
possess a common structure, however different the individual motives 
behind them might be. Generality is attained by modelling individuals as 
rational decision-makers. 

What is commonly meant by ‘rationality’ is practical rationality, that is, 
the rationality of action, given the agent’s belief that that action will lead 
to desirable consequences. Thus, if agent x desires q and believes that action
p will enable him to get q, he is practically rational in choosing p. How he 
came to believe that p leads to q is not relevant, nor is it relevant that his 
belief might be unfounded or patently false. For practical rationality to 
hold, one does not need to make any assumption about belief formation. 
The agent might have come to form his belief by consulting an astrologer, 
tossing a coin, or by investing in market research; what matters is the 
relation between belief and action. 

Imputing epistemic rationality to the agent, on the contrary, implies that 
we judge his belief that p ‘leads to’ q to be grounded on a correct assessment 
of the available evidence.¹ Rationality here has to do with the recognition
of the correctness of the belief, given the evidence at the agent’s disposal. 
It is useful to introduce at this point a distinction between three different 
definitions of ‘rational belief’ which I will use in discussing imputations of 
epistemic rationality to economic agents: 


(i) Weak subjectively rational belief: the belief is logically coherent and 
consistent with other beliefs expressed by the agent. For instance, it is 
required that probabilistic beliefs satisfy the axioms of probability calculus. 

(ii) Strong subjectively rational belief: the agent makes use of all the 
available information (which includes data, ways of processing data, rules 
of inference, and so on) and does not make systematic mistakes. 

(iii) Objectively rational belief: the belief turns out to be factually correct. 
An imputation of objectively rational belief thus asserts a correspondence 
between the agent’s belief and the world. 


This trichotomy depends upon a first distinction between a procedural 
definition of rationality of belief and an end state one. (i) and (ii) correspond 
to the procedural definition: if a belief is formed according to a rational 
procedure (formal and/or substantive), then it is rational. (iii) corresponds 
to the end state definition: however the belief is acquired, it happens to be 
factually correct. Hence, it is ‘objectively’ rational. It must be noticed that
(ii) does not imply (iii), nor does (iii) imply (ii). That is, even if a rational 
procedure has been followed, and hence the belief is subjectively rational, 
it might turn out to be completely false. Conversely, a belief might be
confirmed which is the result of a non-rational inference, or no inference
at all.

1 ‘Correct assessment’ can be attributed different meanings, ranging from the assignment of
probabilities to possible states to the use of particular economic models. I discuss this point
in section 3.

Now, it seems straightforward that, in order to be able to predict people’s 
actions, one has to impute to them practical rationality. That is, given the 
agent’s belief that p is the required action, one must be sure the agent will 
act in conformity with his belief. However, prediction is possible only in 
so far as one knows agents’ beliefs. The difficulty with this account is that 
we lack any independent way of knowing what the agent’s beliefs are. 

In traditional neoclassical models this is not a problem, since here the 
agent has perfect knowledge of the certain outcomes of the alternatives open 
to him, can identify the best of them, and chooses it on that account. This 
enables one to predict what such agents will do. One only needs to know 
the agent’s preferences in order to determine his choice. There is no need 
to infer the agent’s belief from what we think it would be rational for him 
to believe in such circumstances. 

Perfect knowledge, however, is not the most common—nor the most 
interesting—case in economics. A common case in which practical ration- 
ality is too ‘thin’ a requirement is the case of choice under uncertainty. If 
a person must choose among actions whose outcomes are uncertain the 
action chosen, even given practical rationality, will crucially depend upon 
his beliefs about the probabilities of the various possible outcomes. The 
economist cannot hope to predict the agent’s choice without being able to 
predict what beliefs the agent will form, given the information available to 
him, and this is possible only by imputing epistemic rationality to the agent. 
Epistemic rationality, in this case, would mean that we expect the agent to 
assign a set of probabilities to the possible states of nature, and that these 
probabilities satisfy the axioms of probability calculus. 

Characterising epistemic rationality is considerably more difficult in the 
case of a social environment. The subject matter of economics, after all, is 
not the behaviour of solitary agents confronting a natural environment. 
What concerns the economist is rather the interaction of economic agents 
whose actions affect one another, and the outcome of such interactions.¹
Thus, while a first step in any economic reasoning deduces the action that an
agent will choose for a given specification of his objectives and constraints, a
second step is needed in economic argumentation proper.

1 The distinction I wish to make is the one that Hayek [1937] draws between ‘the Pure Logic
of Choice’ and ‘economics as a social science’.

This is an argument which deduces collective outcomes from the solu-
tions to the decision problems of isolated agents. This second stage proceeds
by assuming that the collective outcome will be a state of equilibrium. By
this is meant a situation in which no agent would have any reason to change
his action, if he were to take the actions chosen by other agents as given.
Given knowledge of what each agent would choose under each possible set
of environmental constraints (i.e., each possible configuration of actions on
the part of other agents)—the result of the first argument—it is then possible 
to deduce which joint actions, if any, would constitute equilibria. If the 
argument from the assumed existence of equilibrium is to have any pre- 
dictive power it is necessary (as in the case of the isolated agent) for the 
economist to specify agents’ beliefs about their environment. 
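
The notion of equilibrium used here (no agent has reason to change his action, taking the others' actions as given) can be made concrete for a small two-agent case. The following is a minimal sketch under stated assumptions: the function name, the two example games and all payoff numbers are hypothetical and do not come from the text. It enumerates joint actions and keeps those from which neither agent gains by deviating alone.

```python
from itertools import product


def pure_equilibria(payoffs: dict) -> list:
    """Return the joint actions (a1, a2) at which neither agent gains by deviating alone.

    payoffs maps each joint action (a1, a2) to a pair (u1, u2) of payoffs.
    """
    actions_1 = sorted({a1 for a1, _ in payoffs})
    actions_2 = sorted({a2 for _, a2 in payoffs})
    equilibria = []
    for a1, a2 in product(actions_1, actions_2):
        u1, u2 = payoffs[(a1, a2)]
        # Agent 1 takes a2 as given; agent 2 takes a1 as given.
        agent_1_content = all(payoffs[(alt, a2)][0] <= u1 for alt in actions_1)
        agent_2_content = all(payoffs[(a1, alt)][1] <= u2 for alt in actions_2)
        if agent_1_content and agent_2_content:
            equilibria.append((a1, a2))
    return equilibria


# Hypothetical payoffs: two firms each choose a "high" or "low" output level.
OUTPUT_GAME = {
    ("low", "low"): (3, 3),
    ("low", "high"): (1, 4),
    ("high", "low"): (4, 1),
    ("high", "high"): (2, 2),
}

# Hypothetical game in which no joint action is stable.
MATCHING_GAME = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}

if __name__ == "__main__":
    print(pure_equilibria(OUTPUT_GAME))
    print(pure_equilibria(MATCHING_GAME))
```

The second game returns an empty list, which corresponds to the qualification ‘if any’: the deduction from individual choices need not yield an equilibrium in joint actions of this simple kind.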

But an agent’s environment, in this case, includes the actions undertaken 
by other agents. While a Robinson Crusoe has simply to learn about a 
natural environment (i.e., what the ‘laws’ governing the environment are),
agents in a market economy have to learn about other agents’ behaviour, at 
least in the aggregate. Few would argue that it is untenable to assume that 
the natural environment possesses the properties necessary in order for 
epistemic rationality on the part of isolated optimising agents to be possible. 
The use of epistemic rationality in arguments from equilibrium is more 
problematic. 

In the classic example of equilibrium reasoning—the theory of perfectly 
competitive equilibrium—no such necessity arises. In that case, the only 
information required by each agent in order to choose his optimal action is 
the set of prices at which he can trade. When markets exist for all the relevant 
goods and trading occurs simultaneously in all markets, the problem of 
the agent’s knowledge of the prices that define the options available to him 
is a trivial one. The beliefs of agents, however, become crucial to prediction 
when the classic competitive equilibrium framework is generalised, as in 
the case of non-existent futures markets or imperfectly competitive markets. 

Let us consider the case in which individuals’ decisions depend upon 
their expectations about the future. This can occur for a variety of reasons. 
Household saving decisions depend upon expectations of future income 
levels, portfolio decisions depend upon expectations of the future values of 
the various available assets, the investment decisions of firms depend upon 
expectations of the future demands for their products, and so on. These 
expectations are beliefs about future values of variables which are endo- 
genous to the system; thus individuals’ expectations affect the dyna- 
mics of those variables by leading people to act in ways that influence the 
behaviour of these endogenous variables. 

In order to predict the future values of these endogenous variables, the 
economist has to predict what people will do, which is tantamount to saying 
that he has to make some hypothesis about people’s beliefs. Agents’ beliefs, 
however, are about other people’s actions, which in turn depend on what 
these other people believe. In an interactive environment, when actions are 
largely future-oriented, economic agents are more in the position of a social 
scientist than of a natural scientist, having to assume both practical and 
epistemic rationality on the part of others in order to predict their actions. 
The economist, then, can formulate criteria for epistemic rationality on the 
part of agents only if he can formulate the criteria for epistemic rationality 
that those agents must be able to impute to other agents in seeking to 
forecast their behaviour. 




To be sure, economic theorists have long recognised the need for an 
explicit specification of expectations in their models. However, they have 
often dealt with this problem in ways that deny the necessity of an hypo- 
thesis of substantive rationality of the process by which agents form their 
beliefs. In the next two sections, I discuss the main strategies that
economists have adopted for dealing with the problem, and their limitations.


3 EXPECTATIONS AS A DATUM 


One approach argues that only formal rationality of expectations need be 
assumed for economic prediction and explanation (Hicks [1939]). Expec- 
tations are treated in the same manner as preferences: the theory need claim 
nothing but that agents’ expectations are logically consistent, and that each 
agent’s expectations are formed according to some formula that is to be 
inferred empirically in each particular case. The content of agents’ expec- 
tations is thus left quite arbitrary. Yet complete agnosticism with regard to 
the content of expectations vitiates economic prediction.¹

That is, when agents assign probabilities to future events, it is not enough 
to assume these beliefs to be weakly rational. If one just says that agents’ 
probabilistic beliefs are bets they are willing to make about the occurrence 
of a set of events, one can derive a consistent (rational) subjective probability 
measure for each agent, provided that the agent’s choices among hypo- 
thetical bets satisfy certain axioms. If, however, this weak kind of rationality 
is all one can assume, the economist would become incapable of inferring 
what the agents’ subjective view of the future is, and thus would be at a loss 
in trying to predict agents’ behaviour. 

It is of course possible that the economist has direct knowledge of agents’ 
expectations. Then, like a theory of choice which assumes only a weak sort 
of rationality as to agents’ preferences, this treatment of expectations would 
have predictive content only where coupled with an assumption of stability 
over time of expectations. But the stability over time of expectations would 
have to mean either of two equally objectionable things. Either expectations 
are independent of experience, which is not consistent with epistemic 
rationality in any meaningful sense; or the expectations in question must 
be continuously confirmed by experience, and thus either this constitutes 
an unbelievable coincidence, or they have to satisfy substantive criteria of 
rationality, in addition to the purely formal ones assumed thus far. 

An alternative to treating expectations as independent from experience 
is to assume a particular functional form for the rule by which agents form 
their expectations. This functional form is treated as a datum about agents, 
like their preferences. The assumption of a particular functional form, with 
only a small number of free parameters, is regarded as no more objectionable
for purposes of applied work than the assumption of a particular functional
form in the case of econometric estimation of utility functions or production
functions. The most commonly used functional form is known as the
hypothesis of ‘adaptive expectations’ (AE) (Cagan [1956]; Friedman [1957];
Nerlove [1958]). Briefly stated, AE says that the expected value of a given
variable is adjusted over time in response to past forecast errors.

1 Early critics of such a position include E. Lundberg [1937], J. A. Schumpeter [1939] and
L. M. Lachmann [1943].

To see more clearly how this hypothesis is applied, let us consider a
simple example. Suppose that an economist believes that the price of corn
is determined by a relation of the form $X_t = a + bX_t^e + z_t$. Here $X_t$ is the
price of corn in period $t$, while $z_t$ is a random variable representing, say,
demand shocks. $X_t^e$ stands for agents’ expectations, held in period $t-1$, of the
price of corn in period $t$ (we may suppose that the supply of corn in period $t$ depends
upon decisions taken in period $t-1$ on the basis of these expectations). In
order to predict $X_t$ the economist has to give empirical content to $X_t^e$, which
is unobservable.

According to the hypothesis of AE, agents use an expectational rule 
according to which the expected value of a given variable is some function 
of the values taken by that variable in the past, specifically: 


$$X_t^e = \beta X_{t-1} + \beta(1-\beta)X_{t-2} + \cdots + \beta(1-\beta)^{n}X_{t-n-1}$$


where $\beta$ lies between 0 and 1.
Thus, we have that


$$X_t = a + b\left(\beta X_{t-1} + \beta(1-\beta)X_{t-2} + \cdots\right) + z_t,$$


which, by repeated substitution, yields $X_t$ as a function of the history of
the random variable $z$ alone. An attractive feature of AE is that it allows
one to relate expected unobservable variables to actual observed variables
in previous periods; moreover it allows one to predict that, whenever the
value of $X$ becomes stable, people will gradually ‘catch on’ to it.
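
The AE rule can be written either as geometrically declining weights on past values or as a step-by-step correction of the previous forecast error. The sketch below is an illustration under stated assumptions, not the author's formulation: the function names, the starting expectation of zero and the numbers are hypothetical. It computes the expectation both ways and shows the ‘catching on’ property when the variable settles at a constant value.

```python
def ae_distributed_lag(history: list, beta: float) -> float:
    """Expectation as beta * sum over j of (1 - beta)**j * X_{t-1-j}.

    history is ordered oldest first, so history[-1] is X_{t-1}; the sum is
    truncated at the available history.
    """
    return sum(beta * (1.0 - beta) ** j * x for j, x in enumerate(reversed(history)))


def ae_recursive(history: list, beta: float, initial_expectation: float = 0.0) -> float:
    """Error-correction form: each period the expectation moves a fraction beta
    of the way towards the value just observed."""
    expectation = initial_expectation
    for x in history:
        expectation += beta * (x - expectation)
    return expectation


if __name__ == "__main__":
    prices = [1.0, 4.0, 2.5, 3.0]  # hypothetical past values of X
    print(ae_distributed_lag(prices, beta=0.3))
    print(ae_recursive(prices, beta=0.3))
    # When X stays at 5.0, the expectation approaches 5.0.
    print(ae_recursive([5.0] * 50, beta=0.3))
```

With a starting expectation of zero the two forms give the same number, which is why the weighted-average rule can also be described as adjustment in response to past forecast errors.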
These qualities cannot make up, however, for the two main defects of 
this model. For one, AE is at odds with individual rationality; it assumes 
that agents use only the past history of the variable they are forecasting in 
making their predictions. However, in many cases, it would be reasonable 
to use other sorts of information. Suppose that OPEC announces a doubling 
of oil prices. From the moment the news becomes available, economists 
will predict higher inflation, and it is the case that both the news and the 
economists’ predictions are fully available to all agents in the economy. In 
this circumstance, one would expect rational agents to take into account 
this relevant piece of information in forming their forecasts. According to 
the AE model, instead, ‘individuals raise inflation expectations only after 
higher inflation has gradually fed into the past data from which they extrapo- 
late. Adjustment of expectations is very sluggish. Using such a rule, indi- 
viduals would make systematic mistakes, underpredicting the actual 
inflation rate for many periods after the oil price rise. It is not plausible 
that individuals would take no action to amend the basis of their forecasting 
rule under such circumstances’ (Begg [1982, pp. 25-6]). Unless the
economist is ready to find a good reason why agents do not believe in OPEC’s
announcements and economic forecasts, there is no possible justification
for systematic disregard of new information.
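
The sluggishness described in the quoted passage can be seen in a small numerical experiment. The sketch below assumes a hypothetical inflation series that jumps from 2 to 10 per cent and stays there, together with a hypothetical adjustment coefficient of 0.3; none of these figures comes from Begg or from the text. It records the one-period-ahead adaptive forecasts and the resulting forecast errors.

```python
def adaptive_forecasts(actual: list, beta: float, initial_expectation: float) -> list:
    """One-period-ahead adaptive forecasts: each forecast is made before the
    corresponding actual value is observed, then revised by a fraction beta of
    the forecast error."""
    forecasts = []
    expectation = initial_expectation
    for x in actual:
        forecasts.append(expectation)
        expectation += beta * (x - expectation)
    return forecasts


def forecast_errors(actual: list, forecasts: list) -> list:
    """Actual value minus forecast; a positive error means underprediction."""
    return [x - f for x, f in zip(actual, forecasts)]


if __name__ == "__main__":
    # Hypothetical inflation rates: 2 per cent for five periods, then 10 per cent.
    inflation = [2.0] * 5 + [10.0] * 10
    forecasts = adaptive_forecasts(inflation, beta=0.3, initial_expectation=2.0)
    for period, error in enumerate(forecast_errors(inflation, forecasts)):
        print(f"period {period}: forecast error {error:.3f}")
```

Every error after the jump has the same sign and shrinks only gradually, which is the sense in which agents using this rule make systematic rather than random mistakes.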

The second defect is that, in order to use the proposed approach in 
analysing the consequences of policy interventions, one needs to be able to 
assume that the expectations in question are independent of the policy 
chosen. But if the policy intervention can predictably alter the course of 
the economy, rational agents ought not to maintain expectations of the 
future state of the economy that fail to take this into account.²


4 RATIONAL EXPECTATIONS 


Such considerations lie at the core of what has been called the ‘rational 
expectations’ hypothesis (REH). In its original formulation, it states that 
the ‘expectations of firms (or, more generally, the subjective probability 
distribution of outcomes) tend to be distributed about the prediction of the 
theory (or the “objective” probability distribution of outcomes)’ (Muth
[1961, p. 316]). REH thus says that subjective forecasts are based on the
‘objective’ distribution, meaning by that the mathematical conditional
expectation of the variable forecasted. On average, therefore, subjective
expectations are equal to the true values of the forecasted variable.³

To see the difference between AE and REH, let us go back to the previous
example. We have the same model, $X_t = a + bX_t^e + z_t$, but now we make a
different assumption, i.e., $X_t^e = E(X_t \mid I_{t-1})$, where $I_{t-1}$ represents the
information available to agents in period $t-1$, i.e. $(X_{t-1}, z_{t-1}, \ldots)$. In this
case, the subjective expectation of $X_t$ at time $t-1$ is equal to the conditional
mean for $X_t$. $X_t^e$ is then equal to the actual stochastic behaviour of the
system. Since subjective and objective probabilities are assumed to
coincide, economic agents’ beliefs are correct, or objectively rational in
sense (iii).

1 It is however possible to justify AE by making recourse to a special class of stochastic
processes. For example, if all that can be observed is a random variable which is the sum of
a random walk and a white noise stochastic process, neither of which can be directly observed,
then it becomes optimal to form expectations adaptively. Cf. J. F. Muth [1960].

2 This point, together with the observation that agents’ revision of their expectations in
response to a policy change may have a significant effect on the consequences of the policy
change, is often referred to as the ‘Lucas critique’. The classic source is R. E. Lucas [1976].

3 Important general references are Sargent [1979], Lucas and Sargent [1980], and Lucas
[1981]. Begg [1982] surveys the literature through 1981.

Our model of price determination implies that

$$E(X_t \mid I_{t-1}) = a + bX_t^e + E(z_t \mid I_{t-1}),$$

so that, under the above rationality assumption,

$$X_t^e = \frac{1}{1-b}\{a + E(z_t \mid I_{t-1})\}.$$

Then

$$X_t = a + \frac{b}{1-b}\{a + E(z_t \mid I_{t-1})\} + z_t.$$

If $E(z_t \mid I_{t-1})$ can be written as a distributed lag of past values of $z$, then in
this case, too, $X_t$ is a function of the past history of the random variable $z$
alone.
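
The rational-expectations solution can be checked numerically. The sketch below rests on assumptions that are not in the text: the shocks are taken to be independent normal draws with mean zero, so that the conditional expectation of the shock is zero unless stated otherwise, and the parameter values and function names are hypothetical. It computes the expectation that solves the model and simulates the resulting forecast errors.

```python
import random


def rational_expectation(a: float, b: float, expected_shock: float = 0.0) -> float:
    """Expectation solving X^e = a + b * X^e + E(z | I), which requires b != 1."""
    return (a + expected_shock) / (1.0 - b)


def simulate_re_errors(a: float, b: float, sigma: float, periods: int, seed: int = 0) -> list:
    """Forecast errors X_t - X^e_t when shocks are unpredictable normal draws."""
    rng = random.Random(seed)
    errors = []
    for _ in range(periods):
        expected = rational_expectation(a, b)  # E(z | I) = 0 by assumption
        price = a + b * expected + rng.gauss(0.0, sigma)
        errors.append(price - expected)
    return errors


if __name__ == "__main__":
    errors = simulate_re_errors(a=2.0, b=0.5, sigma=1.0, periods=10000)
    print("rational expectation:", rational_expectation(2.0, 0.5))
    print("mean forecast error:", sum(errors) / len(errors))
```

Under these assumptions the forecast error in each period is just the unpredictable shock, so the errors average out to roughly zero, in contrast with the one-signed errors that an adaptive rule can produce after a shift.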

The fact that agents are ‘objectively’ rational in sense (iii) is often 
imputed, in REH models, to their being procedurally rational in sense (ii). 
This means, in REH terms, that agents use all the available information, 
and do not make systematic mistakes. A crucial question to ask is whether 
one definition of rational belief implies the other, i.e., whether optimisation 
(or the optimal use of information) implies that subjective and objective 
probabilities coincide. REH tells us no story of how this coincidence comes 
about, that is, we lack a plausible theory of how agents ‘learn’ to be rational 
in sense (iii). Since the coincidence of subjective and objective probabilities 
means that agents’ beliefs will not be disconfirmed by the eventual outcome, 
agents have no reason to revise them. By definition, such beliefs are equilibrium
beliefs but, as in other economic contexts, here too equilibrium is
assumed, not explained.

The assumption of equilibrium beliefs raises even more serious problems. 
For subjective and objective probabilities to be equated, it is necessary that 
objective probabilities exist. It is well known that, for objective probabilities 
to exist at any point in time, there must be, at least in principle, a universe 
of realisations. The frequency interpretation of probability implies that as 
the number of repetitions gets large, the estimate of the frequency converges 
to the limiting relative frequency. Do observable data contain sufficient 
repetitions to allow for an accurate inference of the frequencies (or the 
‘objective’ probabilities) of economic events? 

While replication is possible in physical experiments, it might not be at 
all characteristic of economic processes. In fact, this is by no means a novel 
issue in economics; Keynes, for instance, carefully distinguished between 
risk and uncertainty. By risk, he meant a situation where replications occur 
and randomness can be objectively quantified, while uncertainty is a situ- 
ation in which we cannot quantify randomness; e.g., while the prospect of 
a European war is uncertain in this sense, the game of roulette is not 
(Keynes [1937]). 
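The role of replication in a situation of risk can be sketched with the roulette example. The code below is an illustration under assumed conditions (a fair single-zero wheel with 18 red pockets out of 37; the function name `relative_frequency` is hypothetical): the relative frequency of red is a poor estimate after a handful of spins and a close one after many.

```python
import random

def relative_frequency(n_spins, seed=0):
    """Share of 'red' outcomes in n_spins of a simulated fair roulette wheel."""
    rng = random.Random(seed)
    # Pockets 0..36; by assumption pockets 0..17 stand for the 18 red ones.
    reds = sum(1 for _ in range(n_spins) if rng.randrange(37) < 18)
    return reds / n_spins

TRUE_P = 18 / 37                       # limiting relative frequency of red
small = relative_frequency(10)         # few repetitions: unreliable estimate
large = relative_frequency(100_000)    # many repetitions: close to TRUE_P
```

No comparable exercise is available for a one-off event such as the prospect of a European war, which is the sense in which Keynes calls it uncertain.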

Robert Lucas [1977, p. 15] stated this point extremely clearly and con- 
cluded that 


in situations of risk, the hypothesis of rational behavior on the part of agents will 
have usable content so that behavior may be explainable in terms of economic 
theory. In such situations, expectations are rational in Muth’s sense. In cases of 
uncertainty economic reasoning will be of no value. These considerations explain 
why business cycle theorists emphasize the recurrent character of the cycle, and 
why we must hope they were right in doing so. Insofar as business cycles can be 
viewed as repeated instances of essentially similar events, it will be reasonable to 
treat agents’ reacting to cyclical changes as ‘risk’, or to assume that their expectations 
are rational. 




The difference between risk and uncertainty can be depicted by an example. 
Suppose agents know the price of oil increased and they may believe it will 
increase again. That is, they calculate the probability of a ‘regime shift’. Let 
us assume that there are only two possible regimes: either the cost of oil 
will remain unchanged at p, or it will be increased to p*. An agent i believes 
that the regime will be the same with probability $\pi^i$, and will change with
probability $1 - \pi^i$. The problem lies in verifying $1 - \pi^i$ with observable data.
If the agent uses past annual oil prices, the estimate of $1 - \pi^i$ will be very
small, since there was essentially one regime shift in 1974. If instead he 
believes the OPEC cartel represents a new structure, then the data from 
the pre-cartel regime are of no value in estimating the probability of another 
price increase. Individuals may well have different subjective evaluations 
of the probability of a regime shift, thus they will make decisions under 
uncertainty (in Keynes’ sense). As a result, rational expectations are not 
well defined. 
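A small sketch makes the estimation problem concrete. The annual record below is hypothetical (a 'low' price regime through 1973 and a 'high' one from 1974 on), and `shift_frequency` is an illustrative helper, not a method taken from the text.

```python
def shift_frequency(regimes):
    """Fraction of year-to-year transitions in which the regime changed."""
    transitions = list(zip(regimes, regimes[1:]))
    if not transitions:
        return None  # no transitions observed, so no estimate
    shifts = sum(1 for prev, curr in transitions if prev != curr)
    return shifts / len(transitions)

# Hypothetical annual record of the oil-price regime, 1950 to 1979.
years = list(range(1950, 1980))
regimes = ["low" if y < 1974 else "high" for y in years]

# Agent A uses the whole record: one shift in 29 transitions.
full_sample = shift_frequency(regimes)

# Agent B treats the cartel as a new structure and discards pre-1974 data.
post_cartel = shift_frequency([r for y, r in zip(years, regimes) if y >= 1974])
```

The two agents end up with different numbers (1/29 against 0 from only five transitions), and the data alone cannot say which sample is the relevant one. That is the sense in which the decision is made under uncertainty.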

The view proper to rational expectations models is quite different. That 
is, it is assumed that there are no truly new events in the world, in the sense 
that all events are drawn from a stable probability distribution. The above 
example can then be rephrased as follows: there is always a probability that 
the average price of oil will increase, say $p_1$; but if the industry forms a
cartel, the probability of a price increase is greater, say $p_2$. The probability
of a price increase is thus conditional on the regime. 

On the other hand, the emergence and disappearance of cartels can be 
seen as random events drawn from a stable distribution, and thus there is 
a mega-distribution that nests the regimes. The probability structure is 
therefore fixed and the process is stationary. If the environment can be 
represented as stochastically stationary, it can be assumed that individuals 
will always be able to infer the correct probabilities from the observable 
data, and rational expectations will be operational. 

Equilibrium expectations thus require the vector of endogenous and 
exogenous variables in the economy to be a jointly stationary stochastic 
process. A stationary process is such that its probability structure does not 
change with time. If the world can be depicted as stochastically stationary, 
then it makes sense to claim that agents, being no more stupid or irrational 
than the economist studying them, condition their probabilities over the 
observable data, data that (in such a world) do exist. 
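What stationarity buys the agent can be illustrated with a simulated series. This is a sketch under assumed conditions (Gaussian draws, with a hypothetical mean shift standing in for a change of regime); the helper `half_sample_means` is illustrative only.

```python
import random

def half_sample_means(shift, n=10_000, seed=1):
    """Means of the first and second halves of a simulated series.

    The second half is drawn with its mean moved by `shift`;
    shift=0.0 gives a stationary series.
    """
    rng = random.Random(seed)
    half = n // 2
    first = [rng.gauss(0.0, 1.0) for _ in range(half)]
    second = [rng.gauss(shift, 1.0) for _ in range(half)]
    return sum(first) / half, sum(second) / half

# Stationary world: what was learned from the past applies to the future.
past, future = half_sample_means(0.0)
stationary_gap = abs(future - past)

# Non-stationary world: the probability structure changes halfway through.
past, future = half_sample_means(2.0)
shifted_gap = abs(future - past)
```

In the stationary case the estimate formed from the first half describes the second half well; after the shift it is off by roughly the size of the shift, however many earlier observations were used.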

Thus expectations are always in equilibrium, in the sense of correctly 
‘mirroring’ a world which obeys stable probability laws. If this is the 
case, the economist is spared the demanding task of imputing substantive 
rationality to the agents’ learning process. In equilibrium, by definition, 
agents need never learn anything further about the laws of motion of their 
environment.¹


1 That the absence of learning is the essence of an equilibrium state is stressed in Hahn [1974, 
p. 55]. 




The problematic issue is precisely that there must be a determinate law 
of motion (possibly stochastic) for economic events. The genesis of such a 
state of affairs cannot be studied, nor may the law of motion change during 
the period under study. Such ‘heroic’ assumptions readily invite criticism. 
As a matter of fact, most of the objections raised against REH concern the 
‘realism’ of assumptions, especially the extension of the idea of maxi- 
misation (proper to microeconomics) to the gathering and processing of 
information. Individuals, it is claimed, do not possess the information 
that would actually enable them to behave in accordance with rational 
expectations models. Moreover, quite apart from the obvious information 
costs, agents might well be unable to optimally use the information at their 
disposal.¹ No one seems to have noticed that the necessary condition for REH
to be viable is precisely the hypothesis that the social environment obeys 
the sort of stable probability laws obeyed by the physical world. Thus, if 
an assumption has to be challenged on the ground of realism, this latter is 
a better candidate. 

But the issue is not simply an empirical one—whether observable data 
reveal a recurrent pattern. A more fundamental problem with this assump- 
tion is that it is inconsistent with the use of economic theory in policy 
analysis. For in a rational expectations world there are no truly discretionary 
policies that can be chosen by a policymaker. If there are some free par- 
ameters whose values can be set by public officials, there must exist an 
objective probability distribution for the values chosen by the officials, in 
order for agents to have subjective expectations that coincide with the 
objective distribution. But if an objective distribution exists, the action of 
the public officials must not be discretionary, and a change in order to 
achieve a different outcome cannot be contemplated. 

If, on the contrary, it remains possible for the authority to whom the 
economist offers his predictions to alter agents’ circumstances in such a 
way as to change economic outcomes, then the environment of those agents 
is not stationary in the sense described above.² Prediction of the consequences
of government policy choices then must address the question of
how agents learn about their changed environment, when it changes. This 
has several consequences. First, it becomes necessary to abandon the 
assumption of equilibrium expectations. If people learn, their beliefs are 
by definition not in equilibrium, and the problem can thus be restated as 
one of establishing under which conditions there will be convergence to a 
new expectational equilibrium. As is well known, economic theory is 
conspicuously silent on the problem of how equilibria are attained, and in 


1 A good example of this type of objection is B. Friedman [1979]. For an excellent survey of 
the literature, see R. J. Shiller [1978]. 

2 Macroeconomic theorists have upon occasion recognised the paradox (that the method of
equilibrium expectations cannot be adopted without precluding the sort of analysis of the
consequences of interventions which is the avowed aim of macroeconomic theory) but
refuse to follow it to its conclusion (Sargent [1984]).




this respect there is no difference between the traditional market-clearing 
equilibrium concept and the expectational one. 

If the focus shifts to the conditions for convergence, it is necessary to 
provide a description of the agents’ learning process, and of the conditions 
under which it can take place. For example, a precondition for learning is 
that the policy in question be intelligible, that is, the change must be per- 
ceived as such by people. Moreover, the policy must be credible, in the 
sense that its implementation is believed to be possible. Only in the case of 
policies satisfying these requirements could economic theory yield pre- 
dictions as to policy consequences. But this is not all. An additional part of 
each agent’s environment that has changed is the state of knowledge of all 
other agents, knowledge on the basis of which they act. Hence a predictive 
economic theory must describe how agents learn the beliefs of others, which 
include the beliefs of these others about the beliefs of others and so on. 

The problem of knowledge of other people’s beliefs is crucial during 
learning. While they learn, agents modify their behaviour. Knowledge, that 
is, changes the economic environment by promoting actions in certain 
directions. Thus each agent, in order to be able to predict, must know other 
agents’ expectations, since their actions will depend on expectations, and 
the aggregate economic outcome to be forecasted is nothing but the result 
of all these actions. In a truly nonstationary world, what an agent is guessing 
is not just, say, the price level in period $t+1$, but also the relevant exogenous
variables at $t+1$ which will determine the price level in that period.

One of those will be the so-called ‘average opinion’ and the agent, in 
order to predict the future price level, should be able to correctly estimate 
it.¹ That is, in the absence of ‘objective’ probabilities to which subjective
forecasts would converge, what other people believe becomes a matter of 
foremost importance, because the price level in t+1 will crucially depend 
on the actions taken by individuals on the basis of their subjective beliefs. 

What matters to each agent is thus what he thinks the ‘average opinion’ 
about the price level at $t+1$ is. Since everyone knows everyone else is trying
to guess what the average opinion is, one must form expectations about the 
average opinion about the average opinion, and so on to higher and higher 
degrees. Such an infinite regress would lead to choices made on the basis 
of subjective beliefs that, because of this radical uncertainty, cannot possibly 
converge to any ‘objective’ probability distribution. However substantively 
rational individuals’ beliefs may be, there is no guarantee that they will turn 
out to be correct. Even if, for argument’s sake, we suppose that the regress 
can be truncated by the existence of a centralised agency telling agents what 
the average opinion is, the problem would remain that agents might not 
trust the results, or might not be sure whether other agents believe what 
they are told, and thus might not yet be able to forecast other people’s 
expectations. 


1 For a discussion of these problems, see R. Frydman [1982] and R. Frydman and E. S. Phelps 
[1983]. 




It is certainly possible to sidestep all these difficulties, together with the 
question of how agents acquire the knowledge that allows them to attain a 
new equilibrium. For example, it might be argued that, after a change in 
policy regime, agents’ beliefs and actions instantaneously arrive at a new 
equilibrium, which is to say a configuration of beliefs and plans of action 
that no agent will ever have any reason to change so long as the policy regime 
continues. If a learning process exists that achieves rapid convergence to 
a new equilibrium, then questions about the effects of deliberate policy 
interventions would indeed make sense. However, not only has such a 
process not yet been devised but, even more important, nothing guarantees 
that there might not be a multiplicity of configurations of beliefs and plans 
of action, each of which would be an equilibrium under the proposed policy 
regime. 

Thus, even if one might not want to abandon the equilibrium assumption, 
this indeterminacy means that it is impossible for this assumption alone 
to yield a predictive theory. A predictive theory of the sort needed for 
macroeconomic policy analysis, therefore, requires a theory of how agents 
learn, which would allow the theorist to predict which equilibrium, if any, 
the economy converges to after a policy change. Without a plausible story 
about how an equilibrium is reached, macroeconomic theory cannot hope 
to be able to generate the policy prescriptions which are its ultimate goal. 


5 CONCLUSIONS 


To conclude, it appears that there exists a tension between economics’ own 
methodological goals and the conceptual apparatus which has been devised 
to satisfy them. I have shown how arguments from rationality have to 
include epistemic rationality for a ‘positive’ economic theory to be viable. 
Methodological individualism treats economic phenomena as endogen- 
ously determined by individual actions. These actions depend on agents’ 
beliefs, thus prediction is possible only in so far as economists are able to 
know these beliefs. In the absence of data about what people ‘really’ believe, 
their beliefs have to be inferred, which means that substantive epistemic 
rationality has to be imputed to agents. Such a strong rationality require- 
ment, however, is only possible under very restrictive conditions, i.e., that 
the economy is in a stationary equilibrium. But if a stationary expectational 
equilibrium is assumed, one of the main objectives of positive economics— 
the capability of offering policy prescriptions—is defeated. 


ACKNOWLEDGMENTS 


I wish to thank Duncan Foley, Roman Frydman, Daniel Hausman, Mary 
Hesse, Martin Hollis, Isaac Levi, Alexander Rosenberg, Amartya Sen, 
Michael Woodford and the participants in seminars at Columbia University 
and the University of Pittsburgh, who have provided useful comments on 
earlier drafts. 

University of Notre Dame and University of Chicago 


REFERENCES 


BEGG, D. H. K. [1982]: The Rational Expectations Revolution in Macroeconomics. Johns 
Hopkins. 

BICCHIERI, M. C. [1984]: Facts, Values, and Positive Knowledge in Economics. Cambridge 
University, Ph.D. dissertation. 

BICCHIERI, M. C. [1985]: ‘Rationality, Expectations, and Positive Economics’. Discussion
Paper no. 287, April. Columbia University.

CAGAN, P. [1956]: ‘The monetary dynamics of hyperinflation’, in M. Friedman (ed.), Studies
in the Quantity Theory of Money. University of Chicago Press. 

FRIEDMAN, B. [1979]: ‘Optimal expectations and the extreme information assumptions of 
“rational expectations” macromodels’, Journal of Monetary Economics, 5, pp. 23-41.

FRIEDMAN, M. [1953]: Essays in Positive Economics. University of Chicago Press.

FRIEDMAN, M. [1957]: A Theory of the Consumption Function. Princeton University Press. 

FRYDMAN, R. [1982]: ‘Towards an understanding of market processes: individual expectations,
learning and convergence to rational expectations equilibrium’, American Economic
Review, 72, pp. 652-68.

FRYDMAN, R. and PHELPS, E. S. (eds.) [1983]: Individual Forecasting and Aggregate Outcomes. 
Cambridge University Press. 

HAHN, F. [1974]: ‘On the notion of equilibrium in economics’. Inaugural Lecture, Cambridge
University. Reprinted in F. Hahn, Equilibrium and Macroeconomics. Basil Blackwell 
(1984). 

HAYEK, F. [1937]: ‘Economics and knowledge’, Economica, 4, pp. 33-54. 

HICKS, J. R. [1939]: Value and Capital. Oxford University Press.

KEYNES, J. M. [1937]: ‘The general theory of employment’, Quarterly Journal of Economics,
51, pp. 209-23.

LACHMANN, L. M. [1943]: ‘The role of expectations in economics as a social science’,
Economica, 37, pp. 12-23.

LUCAS, R. E. [1976]: ‘Econometric policy evaluation: a critique’, in Lucas (1981).

LUCAS, R. E. [1977]: ‘Understanding business cycles’, in Lucas (1981).

LUCAS, R. E. [1981]: Studies in Business Cycle Theory. MIT Press.

LUCAS, R. E. and SARGENT, T. J. (eds.) [1980]: Rational Expectations and Economic Practice.
University of Minnesota Press. 

LUNDBERG, E. [1937]: Study in the Theory of Economic Expansion. London.

MUTH, J. F. [1960]: ‘Optimal properties of exponentially weighted forecasts’, Journal of the
American Statistical Association, 55, pp. 299-306.

MUTH, J. F. [1961]: ‘Rational expectations and the theory of price movements’, Econometrica,
29, pp. 315-35. 

NERLOVE, M. [1958]: ‘Adaptive expectations and cobweb phenomena’, Quarterly Journal of 
Economics, 72, pp. 227-40. 

SARGENT, T. J. [1979]: Macroeconomic Theory. Academic Press. 

SARGENT, T. J. [1984]: ‘Expectations, autoregressions and advice’, AER Papers and Proceed- 
ings, 74. 

SCHUMPETER, J. A. [1939]: Business Cycles. New York. 

SHILLER, R. J. [1978]: ‘Rational expectations and the dynamic structure of macroeconomic 
models: a critical review’, Journal of Monetary Economics, 4, pp. 1-44.


Brit. J. Phil. Sci. 38 (1987), 515-525 Printed in Great Britain 515 


What Price Spacetime Substantivalism? 
The Hole Story 


by JOHN EARMAN and JOHN NORTON 


1 Introduction
2 Local Spacetime Theories
3 What is Spacetime Substantivalism? Denial of Leibniz Equivalence
4 The Verificationist Dilemma
5 The Indeterminism Dilemma


Spacetime substantivalism leads to a radical form of indeterminism within a very 
broad class of spacetime theories which include our best spacetime theory, general 
relativity. Extending an argument from Einstein, we show that spacetime sub- 
stantivalists are committed to very many more distinct physical states than 
these theories’ equations can determine, even with the most extensive boundary 
conditions. 


1 INTRODUCTION


Since the time of Newton, those who hold a substantivalist view of space 
and time have had to address the following dilemma. They must either 


(a) allow that there are distinct states of affairs which no possible obser- 
vation could distinguish or 
(b) give up their substantivalism. 


Thus Leibniz asked Clarke how the world would differ if God had placed 
the bodies of our world in space in some other way, only changing for 
example East into West. There would be no discernible difference. Our 
belief that there was a difference would be based on the ‘chimerical supposition
of the reality of space itself’ (Alexander [1956], p. 26). In the
modern context, an analogous dilemma arises for spacetime substantivalists. 
But with the demise of the verifiability criterion of meaning, it is no longer 
unfashionable for them to escape the dilemma by simply allowing (a). 
Substantivalists were led to this dilemma through their insistence that 
unobservable spatial and temporal properties of matter (e.g. ‘is at position 
x’) are not reducible to observable relational properties of matter (e.g. 
coincidence, betweenness). Relationists seize upon what they regard as a 
superfluous inflation of their ontology and force substantivalists to commit 


Received December 1986




themselves to the distinctness of observationally indistinguishable states of 
affairs. 

In the context of modern spacetime theories, this overcommitment leads 
substantivalists to a new dilemma. Either they must reject substantivalism 
or they must accept a very radical form of indeterminism. The examination 
of how this dilemma arises is the subject of this paper. 

The class of spacetime theories concerned is a very wide and important 
one. In brief the theories posit a differential spacetime manifold upon which 
fields are defined. The behaviour of the fields is determined exclusively by 
partial differential field equations. The class includes Newtonian spacetime 
theories with all, one or none of gravitation and electrodynamics; and special 
and general relativity, with and without electrodynamics. What is most 
significant is that all versions of our best theory of space and time, general 
relativity, belong to the class. Thus substantivalists must face the indeter- 
minism dilemma if they believe our best theory of space and time. 

In developing the dilemma, we shall see that the equations of these 
theories are simply not sufficiently strong to determine uniquely all the 
spatio-temporal properties to which the substantivalist is committed. The 
type of indeterminism involved will be a very radical one indeed. Given 
some neighbourhood of spacetime we shall see that these theories cannot 
uniquely determine the fields within the neighbourhood from even the most 
exhaustive prescription of the fields outside of it. This is true no matter 
how small the neighbourhood. We have christened this behaviour ‘radical 
local indeterminism’. We believe that this radical form of indeterminism is 
a very heavy price to pay for a doctrine that adds no new predictive power 
to our spacetime theories. 

The indeterminism dilemma arises from a very general form of gauge 
freedom in the spacetime theories discussed. This gauge freedom manifests 
itself in the general covariance of the theories’ equations. General covari- 
ance can be understood in the usual passive sense as the form invariance 
of these equations under arbitrary spacetime coordinate transformation. 
Viewed passively, the choice of a gauge is merely a restriction on the 
spacetime coordinate systems which can be used. This obscures the con- 
nection between determinism and the gauge freedom. However the dual 
active interpretation of general covariance makes the connection much 
clearer. It is expressed as a gauge freedom in the theory’s models.¹

That this freedom could lead to radical local indeterminism was dis- 
covered by Einstein late in 1913 in the form of the so-called ‘hole 
argument’.² He did not see how to deal with the resulting dilemma until


1 See Stachel [1985] for a treatment of general covariance in this active sense. Stachel also
maintains a distinction between absolute and dynamic objects and focuses on the concerns 
of Einstein’s ‘hole argument’. 

2 See for example Einstein and Grossmann [1913], pp. 260-1, and a clearer version in Einstein
[1914], pp. 1066-7. Stachel was the first to see clearly that Einstein’s active reading of general 
covariance made the ‘hole argument’ non-trivial. Stachel [1980]. 




late in 1915. Our purpose here is not to present an historically faithful 
version of Einstein’s argument, which has been discussed elsewhere (see 
Norton [1987]). We intend our argument to stand by itself, although we 
wish to make its ancestry known. 


2 LOCAL SPACETIME THEORIES 


We begin by describing the general form of spacetime theories in which we 
shall derive the indeterminism dilemma. These theories posit differentiable 
manifolds on which geometric objects are defined at every point. A model 
of one of these theories will always be an $n+1$ tuple $\langle M, O_1, \ldots, O_n \rangle$. $M$
is a differentiable manifold with all the usual intrinsic structure and $O_1, \ldots,
O_n$ are $n$ geometric objects, defined everywhere on $M$, for some positive
integer $n$.

Each model will satisfy a set of field equations, which are just the van- 
ishing of a subset of the objects defined. That is for some positive integer 
k less than or equal to n, the field equations are 


$O_k = 0,\ O_{k+1} = 0,\ \ldots,\ O_n = 0$


We require that each of the objects in the field equations be tensors. 

Since we allow that some of the objects can be constructed from others 
already defined, this prescription is sufficiently general to include versions 
of just about every classical field theory of interest to us. For example 
special relativistic electromagnetics has models of the form


$\langle M,\ g_{ab},\ D_a,\ F_{ab},\ j^a,\ D_a g_{bc},\ R^a{}_{bcd},\ D_{[a}F_{bc]},\ D_a F^{ab} - j^b \rangle$


$g_{ab}$ is a metric tensor of Lorentz signature, $D_a$ a derivative operator, $F_{ab}$ the
Maxwell field tensor, $j^a$ the charge flux and $R^a{}_{bcd}$ the curvature tensor of the
metric $g_{ab}$. For this version of the theory, the field equations are the vanishing
of $O_5$ to $O_8$. The vanishing of $O_5$ adapts the derivative operator to
the metric and the vanishing of $O_6$ forces $g_{ab}$ to be flat. The final two
equations are Maxwell’s equations.
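Spelled out object by object, the example can be sketched as follows. The numbering $O_1$ to $O_8$ is an assumption, taken to follow the order of the tuple and inferred from the description of the field equations, so that $n = 8$ and $k = 5$.

```latex
\begin{align*}
O_1 &= g_{ab}, & O_2 &= D_a, & O_3 &= F_{ab}, & O_4 &= j^a, \\
O_5 &= D_a g_{bc}, & O_6 &= R^a{}_{bcd}, & O_7 &= D_{[a} F_{bc]}, & O_8 &= D_a F^{ab} - j^b .
\end{align*}
% Field equations: the vanishing of O_5, ..., O_8
\begin{align*}
D_a g_{bc} &= 0 && \text{(derivative operator compatible with the metric)} \\
R^a{}_{bcd} &= 0 && \text{(flat metric)} \\
D_{[a} F_{bc]} &= 0 && \text{(homogeneous Maxwell equations)} \\
D_a F^{ab} &= j^b && \text{(Maxwell equations with source)}
\end{align*}
```

The first four objects are the basic fields; the last four are tensors constructed from them, and it is only these constructed tensors that the field equations set to zero.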

We shall call a spacetime theory a ‘local spacetime theory’ if it has the 
above form and satisfies the completeness condition: 


Completeness condition If a spacetime theory has models of the form
$\langle M, O_1, \ldots, O_n \rangle$ which satisfy field equations


$O_k = 0,\ O_{k+1} = 0,\ \ldots,\ O_n = 0$


then every $n+1$ tuple of this form which satisfies the field equations is a
model of the theory.


We consider only local spacetime theories


The dilemmas developed below arise in local spacetime theories. The 
premier instance of such a theory is our current best theory of space and 


518 John Earman and John Norton 


time, general relativity. All known formulations of general relativity are local 
spacetime theories or formulations which reduce to one.¹ Thus a spacetime
substantivalist who believes general relativity cannot avoid the dilemmas. 

Virtually every other classical spacetime field theory can be formulated 
as a local spacetime theory. We prefer wherever possible to formulate them 
as such. So we take special relativity to have models $\langle M, g_{ab}, R^a{}_{bcd} \rangle$ where
$g_{ab}$ can be any of many possible Minkowski metrics definable on $M$, which
satisfy the field equations $R^a{}_{bcd} = 0$. Thus the completeness condition is
satisfied.

This is by no means a universal practice, especially in older work. Alter- 
nately, one could insist that special relativity deals with just one Minkowski 
spacetime, which is a pair $\langle N, \eta_{ab} \rangle$, where $\eta_{ab}$ is a particular Minkowski
metric defined on $N$, an $R^4$ manifold. What is worrisome about this alternate
portrayal of special relativity is that it starts out by making unnecessary 
global assumptions. We must stipulate in the laws of the theory itself what 
the global manifold topology is to be and incorporate in these laws one of 
the many Minkowski metrics definable on the manifold. 

The success of general relativity has promoted the formulation of 
spacetime theories as local spacetime theories. Such formulations make comparison
between general relativity and these other theories much easier.²

We also believe that there are good but not compelling reasons to for- 
mulate spacetime theories as local spacetime theories. Cosmology has 
always been a far riskier enterprise than local physics. Since the time of 
Aristotle, we have found that the weakest part of a physical theory is the 
global cosmological assumptions it makes. We have learned to our cost that 
it is better to do local physics first and build one’s cosmology from it, rather 
than the other way round. In rendering theories as a local spacetime theory, 
we abide by this heuristic. We determine all the fields on the manifold by 
local field equations, not global stipulation, and we allow the possibility of 
global topologies other than the usual standby of $R^4$.


What represents spacetime? 


What structure in spacetime theories represents spacetime? That is, of what 
does the spacetime substantivalist hold a substantivalist view? We view the 
manifolds M of the models as representing spacetime. 

This view follows naturally from the local formulation of spacetime 
theories. We take all the geometric structure, such as the metric and deriva- 
tive operator, as fields determined by partial differential equations. Thus 
we look upon the bare manifold—the ‘container’ of these fields—as space- 


1 A variational formulation of general relativity does not have tensor field equations, as
required by local spacetime theories. However such field equations are readily derived from 
its basic action principle. 

2 For formulations as local spacetime theories of many versions of Newtonian and special 
relativistic spacetime theories, see Friedman [1983]. 




time. A repeated problem in the literature on spacetime substantivalism is 
a failure to specify clearly the structure to which substantival properties are 
ascribed. A welcome exception is Friedman [1983], chapter VI, where the 
manifold is identified as spacetime and it is argued that we should hold a 
realist view of it. 

The advent of general relativity has made most compelling the identi- 
fication of the bare manifold with spacetime. For in that theory geometric 
structures, such as the metric tensor, are clearly physical fields in spacetime.¹
The metric tensor now incorporates the gravitational field and thus,
like other physical fields, carries energy and momentum, whose density is
represented by the gravitational field stress-energy pseudo-tensor. The
pseudo-tensorial nature of this quantity has made its status problematic. 
But it can still be seen that energy and momentum are carried by the metric 
in a way that forces its classification as part of the contents of spacetime. 
Consider, for example, a gravitational wave propagating through space. In 
principle its energy could be collected and converted into other types of 
energy, such as heat or light energy or even massive particles. If we do not 
classify such energy bearing structures as the wave as contained within 
spacetime, then we do not see how we can consistently divide between 
container and contained. We might consider dividing the metric into an 
unperturbed background and a perturbing wave in the hope that the latter 
alone can be classified as contained in spacetime. This move fails since there 
is no non-arbitrary way of effecting this division of the metric. Finally, 
classifying the metric as part of the container spacetime leads to trivial- 
isation of the substantivalist view in unified field theories of the type 
developed by Einstein, in which all matter is represented by a generalised 
metric tensor. For there would no longer be anything contained in space- 
time, so that the substantivalist view would in essence just assert the inde- 
pendent existence of the entire universe. 

In an alternate view usually associated with Newtonian or special rela- 
tivistic theories, one represents spacetime by the manifold with some 
additional geometric structure, which we shall call its absolute structure. 
This view arises most naturally in the older non-local formulations of 
spacetime theories, in which case the absolute structure is typically posited 
globally ab initio rather than being defined locally through field equations. 
Thus if one gives the above global formulation of special relativity, one
would probably call spacetime the pair $\langle N, \eta_{ab} \rangle$. Or if one had to identify
a structure in a Newtonian spacetime which corresponded to the thing
about which Newton held his substantivalist view, then that would be the
tuple $\langle N, h^{ab}, D_a, dt_a \rangle$, where $h^{ab}$ is the degenerate metric, $D_a$ the derivative
operator and $dt_a$ the absolute time one-form (all defined in the usual
manner).


1 One of us has argued that a primary outcome for Einstein of the principle of equivalence
was the recognition that the Minkowski metric $\eta_{ab}$ of special relativity was a physical field
defined in spacetime, rather than a part of the background of spacetime. Norton [1985].




Our present argument does not address this representation of spacetime 
since it is commonly associated with non-local spacetime theories already 
beyond our compass. We note in passing that the hybrid view (using this
representation of spacetime within local spacetime theories) still leads to
dilemmas of the type discussed below, but they are harder to set up. (See 
note 2 on p. 522.) 


The Gauge Theorem 
The indeterminism dilemma depends on the following theorem: 


Gauge Theorem (General covariance):¹ If $\langle M, O_1, \ldots, O_n \rangle$ is a model of a
local spacetime theory and h is a diffeomorphism from M onto M, then
the carried along tuple $\langle M, h^*O_1, \ldots, h^*O_n \rangle$ is also a model of the
theory.


Proof We need to establish that the vanishing of the field equations

$O_k = 0,\ O_{k+1} = 0,\ \ldots,\ O_n = 0$

is preserved under diffeomorphism. This follows immediately from the
description of the action of the carry along h* in coordinate terms. For
any object $O_i$ with components $(O_i)^m$ in some coordinate system $\{x^m\}$ we
have

$(O_i)^m = (h^*O_i)^{m'}$

where the superscript m' indicates components in the carried along coordinate
system $\{x^{m'}\} = \{h^*x^m\}$. Recall that $O_i$ is a tensor. Therefore
$(O_i)^m = 0$ and thus $(h^*O_i)^{m'} = 0$ as well. Therefore $h^*O_i$ vanishes. This
argument holds for i = k, k+1, ..., n, which establishes that the field
equations hold for the carried along tuple.


Notice that the proof depends on the objects being tensors, which have the 
property of vanishing just in case their components vanish in any coordinate 
system. This is why we restricted the field equations of local spacetime 
theories to tensorial equations.

We shall say that the original model and the carried along model are 
diffeomorphic. Note that the relation of being diffeomorphic divides the 
set of tuples into equivalence classes. 

To see the connection between this gauge theorem and general covariance 
in its usual passive reading, recall that there is a natural one-one correspondence
between diffeomorphisms on M and coordinate transformations
of a particular coordinate system $\{x^m\}$ of M. Let the diffeomorphism
h map the point p of M to hp. Then the corresponding coordinate


1 This result is not new, although it is commonly known through its passive form. Wald writes 
‘the diffeomorphisms comprise the gauge freedom of any theory formulated in terms of 
tensor fields on a spacetime manifold’. Wald [1984], p. 438. 




transformation assigns the new coordinates $\{x^{m'}\}$ to p, where the values of
$\{x^{m'}\}$ at p are equal to the coordinates of hp in the original coordinate system
$\{x^m\}$.

Using this correspondence one can translate theorems from the active to 
the passive language (that is, from theorems dealing with diffeomorphisms
to theorems dealing with coordinate transformations) and vice versa. The
gauge theorem follows immediately from the vanishing of the carry along 
under arbitrary diffeomorphism of vanishing tensors. This result corre- 
sponds to the passive result that the components of these zero tensors 
remain zero under arbitrary coordinate transformation, which is just the 
generally covariant transformation law for the components of a zero tensor. 
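
As a worked illustration of the passive result, the transformation law can be written out for a tensor with one upper and one lower index. The choice of rank is an assumption made only for the example; the same reasoning applies to any rank. Because the law is linear and homogeneous in the components, vanishing components in one coordinate system force vanishing components in every other. The second line shows, for contrast, the standard law for the components of an affine connection, which is not a tensor: its inhomogeneous second term means that vanishing components in one system need not vanish in another.

```latex
\begin{align}
  O^{m'}{}_{n'} &= \frac{\partial x^{m'}}{\partial x^{m}}
                   \frac{\partial x^{n}}{\partial x^{n'}}\, O^{m}{}_{n}
  &&\Longrightarrow\quad
  O^{m}{}_{n} = 0 \ \text{for all } m, n
  \;\Rightarrow\;
  O^{m'}{}_{n'} = 0 \ \text{for all } m', n' \\
  \Gamma^{m'}{}_{n'p'} &= \frac{\partial x^{m'}}{\partial x^{m}}
                          \frac{\partial x^{n}}{\partial x^{n'}}
                          \frac{\partial x^{p}}{\partial x^{p'}}\, \Gamma^{m}{}_{np}
                        + \frac{\partial x^{m'}}{\partial x^{m}}
                          \frac{\partial^{2} x^{m}}{\partial x^{n'}\,\partial x^{p'}}
\end{align}
```

This is why the restriction of field equations to tensorial equations matters for the proof of the gauge theorem.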


3 WHAT IS SPACETIME SUBSTANTIVALISM?: 
DENIAL OF LEIBNIZ EQUIVALENCE 


In broad outline, the spacetime substantivalist holds that spacetime can 
exist independently of any of the things in it. In this form, the thesis is 
disastrous, because it is automatically denied by every spacetime theory 
with which we deal. They all postulate that there are always fields at every 
point in spacetime. That is, they agree that there cannot be unoccupied 
spacetime events, contrary to the standard position taken by substantivalists 
against relationists. 

We can imagine many less problematic ways of reformulating the sub- 
stantivalist thesis. We might consider the thesis that spacetime is not reduc- 
ible to other structures; or the thesis that we must unavoidably quantify 
over spacetime events in our spacetime theories. Perhaps we might consider 
a strict realist reading of the models of spacetime theories. Each model no 
longer represents a physically possible world. Rather each model is a physi- 
cally possible world, one of them being our world. That is the M of one 
model of a true spacetime theory is the spacetime of our world. 

Fortunately we do not need to settle this reformulation problem. What- 
ever reformulation a substantivalist may adopt, they must all agree con- 
cerning an acid test of substantivalism, drawn from Leibniz. If everything 
in the world were reflected East to West (or better, translated 3 feet East), 
retaining all the relations between bodies, would we have a different world? 
The substantivalist must answer yes since all the bodies of the world are 
now in different spatial locations, even though the relations between them 
are unchanged. 

The necessary agreement of substantivalists on this test is all we shall 
need to arrive at the dilemmas below. But first we must translate the test 
into the context of local spacetime theories. The diffeomorphism is the 
counterpart of Leibniz’ replacement of all bodies in space in such a way that 
their relative relations are preserved. For example, represent two bodies in 
a local spacetime theory by two spatially small regions of high energy 




density in the obvious way. Then all their relative properties, such as 
the spacetime interval separating them and their relative velocities upon 
collision, remain unchanged under arbitrary diffeomorphism.

In sum, substantivalists, whatever their precise flavour, will deny: 


Leibniz equivalence Diffeomorphic models represent the same physical 
situation. 


This denial already places substantivalists at odds with standard modern 
texts in general relativity, in which this equivalence is accepted unquestioningly
in the specific case of manifolds with metrics.¹ We are now in a
position to establish the two dilemmas for spacetime substantivalists.²


4 THE VERIFICATIONIST DILEMMA 


This dilemma amounts to little more than a restatement of the sub- 
stantivalists’ denial of Leibniz equivalence. To complete the dilemma we 
need only note that spatio-temporal positions by themselves are not observ- 
able. Observables are a subset of the relations between the structures defined 
on the spacetime manifold. Thus we cannot observe that body b is centred 
at position x. What we do observe are such things as the coincidence of 
body b with the x mark on a ruler, which is itself another physical system. 
Thus observables are unchanged under diffeomorphism. Therefore diffeo- 
morphic models are observationally indistinguishable. 

Substantivalists must either deny Leibniz equivalence or deny their 
substantivalism. That is, they must either 


(a) accept that there are distinct states of affairs which are observationally 
indistinguishable, or 
(b) deny their substantivalism. 


5 THE INDETERMINISM DILEMMA 
To arrive at this dilemma, we need a simple corollary of the gauge theorem: 
Hole corollary Let T be a model of some local spacetime theory with 


1 Hawking and Ellis [1973], p. 56; Sachs and Wu [1977], p. 27. This acceptance enables 
modern treatments of local spacetime theories to avoid radical local indeterminism. Older 
treatments of classical mechanics and special relativity were not formulated as local spacetime 
theories. This type of indeterminism was not a problem since they dealt with a single 
manifold plus absolute structure as the fixed spacetime canvas in which the gauge freedom 
cannot arise. 

2 Manifold-plus-absolute-structure substantivalists will typically face dilemmas of similar
origin. M-p-a-s substantivalists are subject to the Leibniz acid test just in case their absolute 
structure has symmetries, which is overwhelmingly the case. They must deny that two 
models represent the same physical situation if they are diffeomorphic under a symmetry 
transformation. This naturally generalises to the denial of Leibniz equivalence (and the 
dilemmas below). The generalisation is difficult to avoid. Translational symmetries, for 
example, can be composed out of hole diffeomorphisms. Thus the affected m-p-a-s sub- 
stantivalists must deny Leibniz equivalence at least for hole diffeomorphisms, which already 
is sufficient to yield the dilemmas through the hole corollary of Section 5. 




manifold M and H (for hole) any neighbourhood of M. Then there exist 
arbitrarily many distinct models of the theory on M which differ from 
one another only within H. 


Proof Let h be a ‘hole diffeomorphism’, i.e., one which differs from the
identity diffeomorphism within H, but smoothly becomes the identity 
on the boundary and outside H. Then from the gauge theorem, the carry 
along of T under h satisfies the requirement. Since there are arbitrarily 
many hole diffeomorphisms for H, there are arbitrarily many such carry 
along models satisfying the requirement. 
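
The construction in the proof can be illustrated with a one-dimensional toy sketch. Everything in it is an assumption made for illustration and is not taken from the text: the manifold is the real line, the hole H is the interval (-1, 1), the map `hole_diffeo` is built from a standard smooth bump function, and `field` is a hypothetical scalar field standing in for one of the objects $O_i$. The carried along field takes at h(p) the value the original field takes at p, so it agrees with the original outside H and differs from it inside.

```python
import math


def bump(x):
    """Smooth bump: positive on (-1, 1), identically zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def hole_diffeo(x, eps=0.5):
    """Toy hole diffeomorphism h of the real line with hole H = (-1, 1).

    h is the identity outside H and shifts points inside H. For eps = 0.5
    the derivative 1 + eps * bump'(x) stays positive, so h is strictly
    increasing and therefore invertible.
    """
    return x + eps * bump(x)


def inverse_hole_diffeo(y, eps=0.5, tol=1e-12):
    """Invert h by bisection; outside the hole the inverse is the identity."""
    if abs(y) >= 1.0:
        return y
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if hole_diffeo(mid, eps) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def field(x):
    """Hypothetical scalar field on the line (illustrative only)."""
    return math.sin(x) + 0.1 * x * x


def carried_field(x, eps=0.5):
    """Carry along of `field` under h: its value at h(p) is field(p)."""
    return field(inverse_hole_diffeo(x, eps))


if __name__ == "__main__":
    for x in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
        print(x, field(x), carried_field(x))
```

Varying `eps` (within the range that keeps the map increasing) gives a family of distinct carried along fields that all agree outside the hole, which is the one-dimensional analogue of the arbitrarily many models of the corollary.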


The name of this corollary stems from Einstein’s original discovery of it in 
a specialised form. He considered a matter free hole in a source mass 
distribution and showed that the gauge freedom of any generally covariant 
gravitational field equation in general relativity allowed multiple metric 
fields within the hole. 

It now follows immediately that the substantivalists’ denial of Leibniz 
equivalence leads to a very radical form of indeterminism in all local space- 
time theories, since for a substantivalist the diffeomorphic models of the 
hole corollary must represent different physical situations. 

Consider first various forms of Laplacian determinism. Suppose that the 
spacetime models in question admit global time slices.¹ In the Newtonian
setting such a slice is a hyperplane of absolute simultaneity, while in the 
relativistic setting it is a spacelike hypersurface without edges. The Lapla- 
cian would then like to prove that the laws of physics guarantee that the 
state on a time slice S uniquely fixes the state to the future of S; or failing 
that, the state on a finite sandwich lying between two slices S and S’ fixes 
the state to the future of the sandwich; or failing that, the state on S and to 
the past of S fixes the state to the future of S. 

If spacetime is substantival, no such proof can be forthcoming within 
local spacetime theories. For by the hole corollary with The Hole placed
in the future of S, if $\langle M, O_1, O_2, \ldots \rangle$ is a model of our theory, then there
is another model $\langle M, O'_1, O'_2, \ldots \rangle$ which is identical with the first up to
and including the instant corresponding to S (i.e., for any p in M which
lies to the past of S, $O_i(p) = O'_i(p)$) but which diverges from the first to the
future of S.

It is worth noting that, contrary to the common wisdom, Laplacian 
determinism typically does not obtain a clean form in Newtonian theories. 
See Earman [1986]. Intuitively, Laplacian determinism breaks down 
because there is no upper bound on the velocity of causal propagation, 
with the result that influences can ‘sneak in’ from spatial infinity without 
announcing themselves on the chosen slice S. 

In the face of such space invaders one might hope to achieve a non-trivial 
form of determinism by shifting from a pure initial value problem to a 


1 Otherwise, the global version of Laplacian determinism does not apply.




boundary-initial value problem. That is, the state is specified on S itself 
and also the walls of a tube which cuts through all the time slices in the 
future of S. The hope is that these boundary conditions will determine a 
unique interior for the tube amongst the models of the theory. But assuming 
substantivalism, the hole corollary dashes these hopes. Just place The Hole 
within the tube. 

By now the reader has no doubt seen that the hole corollary forces 
substantivalists to conclude that no non-trivial form of determinism can 
obtain in local spacetime theories. The state within any neighbourhood of 
the manifold can never be determined by the state exterior to it, no matter 
how small the neighbourhood and how extensive the exterior specification. 

Of course this radical local indeterminism can be escaped easily by just 
accepting Leibniz equivalence. Then the diffeomorphic models of the hole 
corollary represent the same physical situation and the indeterminism dis- 
cussed becomes an underdetermination of mathematical description with 
no corresponding underdetermination of the physical situation. But accept- 
ing Leibniz equivalence entails denying substantivalism. 

We emphasise that our argument does not stem from a conviction that 
determinism is or ought to be true. There are many ways in which deter- 
minism can and may in fact fail: space invaders in the Newtonian setting; 
the non-existence of a Cauchy surface¹ in the general relativistic setting;
the existence of irreducibly stochastic elements in the quantum domain, 
etc. Rather our point is this. If a metaphysics, which forces all our theories 
to be deterministic, is unacceptable, then equally a metaphysics, which 
automatically decides in favour of indeterminism, is also unacceptable. 
Determinism may fail, but if it fails it should fail for a reason of physics, 
not because of a commitment to substantival properties which can be 
eradicated without affecting the empirical consequences of the theory. 

In sum we have shown that substantivalists must either deny Leibniz 
equivalence or deny their substantivalism. That is, they must either 


(a) accept radical local indeterminism in local spacetime theories or 
(b) deny their substantivalism. 


Perhaps it is acceptable to save substantivalism in the verificationist 
dilemma by accepting option (a). But we feel that the price one has to pay 
in accepting the option (a) in the indeterminism dilemma is far too heavy a 
price to pay for saving a doctrine that adds nothing empirically to spacetime 
theories.²


Dept. of History and Philosophy of Science, University of Pittsburgh


1 See Hawking and Ellis [1973] for a definition of this concept.

2 We have not concluded here that spacetime is relational, since the literature contains so
many conflicting usages of the term ‘relationism’. Of course the conclusion is established if
relationism is just the negation of substantivalism. But this presupposes far too crude a
dichotomy—substantivalism versus relationism—from which discussion of these matters 
has suffered too long. Relationism is not established if it implies that all motion is the relative 
motion of bodies, as Leibniz apparently held. 


REFERENCES 


ALEXANDER, H. G. [1956] (ed.): The Leibniz—Clarke Correspondence. Manchester University 
Press. 

EARMAN, J. [1986]: A Primer on Determinism. Western Ontario Series in Philosophy of Science, 
vol. 32. D. Reidel. 

EINSTEIN, A. [1914]: ‘Die formale Grundlage der allgemeinen Relativitaetstheorie’, Preuss. 
Akad. der Wiss., Sitz., pp. 1030-85. 

EINSTEIN, A. and GROSSMANN, M. [1913]: ‘Entwurf einer verallgemeinerten Relativitaetstheorie
und einer Theorie der Gravitation’, Zeitschrift fuer Mathematik und
Physik, 63, pp. 225-64.

FRIEDMAN, M. [1983]: Foundations of Space-Time Theories. Princeton University Press. 

HAWKING, S. and ELLIS, G. F. R. [1973]: The Large Scale Structure of Space-time. Cambridge 
University Press. 

NORTON, J. [1985]: ‘What was Einstein’s Principle of Equivalence?’, Studies in History and 
Philosophy of Science, 16, pp. 203-46. 

NORTON, J. [1987]: ‘Einstein, the Hole Argument and the Reality of Space’, in J. Forge (ed.), 
Measurement, Realism and Objectivity. Reidel. 

SACHS, R. K. and WU, H. [1977]: General Relativity for Mathematicians. New York: Springer.

STACHEL, J. [1980]: ‘Einstein’s Search for General Covariance’. Paper read at the Ninth 
International Conference on General Relativity and Gravitation, Jena. 

STACHEL, J. [1985]: ‘What a Physicist Can Learn from the Discovery of General Relativity’. 
Proceedings, Marcel Grossmann Meeting, Rome, 1985. 

WALD, R. [1984]: General Relativity. University of Chicago Press. 


Brit. J. Phil. Sci. 38 (1987), 527-532 Printed in Great Britain


Are There Logical Limits For Science? 
by E. M. ZEMACH 


Rescher has presented a proof that a completed science is logically impossible; not 
every truth can be known. I show that the proof is valid only if it is read de re. One 
of its premises, however, is an obvious truth only on a de dicto reading; read de re 
it is false. What the proof shows, therefore, is that science has no limits and any 
true proposition can be known. We can, however, know it only in the meagre de re, 
and not in the informationally rich de dicto, sense of ‘know’. 


“Perfected science is a mirage; completed knowledge a chimera,” claims
Rescher¹ in an argument (following similar proofs by Fitch,² Routley,³ and
others) intended to demonstrate that total knowledge is logically impossible.
In order to prove that, Rescher asks us to “consider the following four
plausible-seeming theses:


(K1) Authentic knowledge is inherently veridical: Kp → p.

(K2) A conjunction can only be known if both its conjuncts are known:
K(p&q) → (Kp&Kq).

(K3) Some truth is not known: (∃p)(p&-Kp).

(K4) All truths are knowable: p → PKp.


It is readily demonstrated,” Rescher continues, “that these four theses are
inconsistent:


1. K-Kp → -Kp    substitution in (K1)

2. -(K-Kp&Kp)    from (1)

3. K(-Kp&p) → (K-Kp&Kp)    substitution in (K2)

4. -K(-Kp&p)    from (2), (3)

5. N-K(-Kp&p)    from (4) by necessitation (p being arbitrary)

6. -PK(-Kp&p)    from (5)

7. -(-Kp&p)    from (6), (K4)

8. (∀p)-(-Kp&p)    from (7) by generalization

9. -(∃p)(-Kp&p)    from (8).


Received June 1986 


1 N. Rescher, The Limits of Science, University of California Press [1984], p. 150.

2 F. B. Fitch, ‘A Logical Analysis of some Value Concepts’, Journal of Symbolic Logic, 28,
pp. 135-42 [1963].

3 R. Routley, ‘Necessary Limits to Knowledge: Unknown Truths’, in E. Morscher et al. (eds.),
Essays in Scientific Philosophy, Bad Reichenhall [1981], pp. 93-113.




But (9) contradicts (K3).” Rescher concludes that since K1–K4 generate a
contradiction, and K1–K3 are above reproach, the culprit is K4 and it
should be rejected: “We must concede that some truths are unknowable” 
(ibid.). 
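
Steps (1) to (4) of the quoted derivation can be checked mechanically in a small possible-worlds sketch. The modelling choices are assumptions of the sketch and are not Rescher's: propositions are sets of worlds, ‘K’ is treated as a box operator over an accessibility relation, reflexivity of that relation plays the role of (K1), and (K2) holds automatically for any box operator. Only frames with three worlds are searched, so this is an illustration rather than a proof, and it is silent on the de dicto and de re readings discussed later.

```python
from itertools import product

WORLDS = (0, 1, 2)  # a hypothetical three-world frame


def K(prop, access):
    """Worlds at which `prop` holds in every accessible world."""
    return frozenset(
        w for w in WORLDS
        if all(v in prop for v in WORLDS if (w, v) in access)
    )


def neg(prop):
    """Complement of a proposition, i.e. its negation."""
    return frozenset(WORLDS) - prop


def all_props():
    """Every proposition, i.e. every subset of WORLDS."""
    for bits in product((False, True), repeat=len(WORLDS)):
        yield frozenset(w for w, b in zip(WORLDS, bits) if b)


def reflexive_relations():
    """Every reflexive accessibility relation on WORLDS."""
    diag = {(w, w) for w in WORLDS}
    off_diag = [(w, v) for w in WORLDS for v in WORLDS if w != v]
    for bits in product((False, True), repeat=len(off_diag)):
        yield diag | {pair for pair, b in zip(off_diag, bits) if b}


def premises_k1_k2_hold():
    """Check (K1) Kp -> p and (K2) K(p&q) -> (Kp&Kq) in every frame."""
    for access in reflexive_relations():
        for p in all_props():
            if not K(p, access) <= p:
                return False
            for q in all_props():
                if not K(p & q, access) <= (K(p, access) & K(q, access)):
                    return False
    return True


def step4_holds():
    """Check step (4): K(-Kp & p) is true at no world of any frame."""
    for access in reflexive_relations():
        for p in all_props():
            if K(neg(K(p, access)) & p, access):
                return False
    return True


def k3_is_satisfiable():
    """Check that some frame has a truth that is not known, as (K3) says."""
    return any(
        p & neg(K(p, access))
        for access in reflexive_relations()
        for p in all_props()
    )


if __name__ == "__main__":
    print(premises_k1_k2_hold(), step4_holds(), k3_is_satisfiable())
```

In this sketch (K1) to (K3) are jointly satisfiable while the conjunction ‘-Kp & p’ is never known, which mirrors the point that the contradiction arises only once (K4) is added.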

Does Rescher prove, by means of logic alone, that some truths are 
unknowable? I do not think so. Instead of showing that completed science 
is impossible, another interesting thesis, that I shall spell out later on, has 
been proven. First, however (since it has been misunderstood¹), let me
make clear what the proof says, primarily, what ‘Kp’ means in it. As
Schlesinger² rightly saw, it can only mean, ‘it is now known that p’. For if
‘Kp’ is read, ‘at some time or other we know that p’, premise K3 is the 
statement that some propositions will never be known, which is hardly a 
truism, and contradicts the perfectibility of science hypothesis. It is thus 
wrong to read K4 as saying, ‘if p, then it is possible that we know that p at 
some time or other’. It is also impossible to obtain that reading of K4 by 
treating its modal operator as a quantifier over times (in the real world), 
i.e., by reading K4 thus: ‘if p, then at some time or other we know that p.’ 
Under such interpretation of the modal operators, (5), and all later steps in 
the proof, are invalid, for there is no reason to believe that if now we do 
not know that p, then we shall never know that p. Rescher can hardly be 
accused of that fallacy. Thus, in this proof ‘K’ means ‘it is now known’, ‘P’
has its usual sense (semantically, a possible worlds quantifier), and K4 
means, ‘any true proposition is such that it is logically possible that we 
know it now’. 

But Schlesinger is wrong to think that the said fact, i.e., that ‘K’ means
‘it is now known’, justifies rejecting K4. He writes: “The final conclusion
‘(∃p)(p&-PKp)’ states nothing more noteworthy than that there is some true
proposition which at the present time is in principle unknowable.” But that
result is very noteworthy indeed; it is a major discovery that it is now 
logically impossible to know some true theorem! Far from refuting Rescher, 
Schlesinger has in fact conceded his point, i.e., that K4 is false. 

It is a misunderstanding to construe the thesis that Science has Logical 
Limits (SLL) as a claim, that there is no future time t such that at t every 
fact is known to us. For that prediction the above proof is quite irrelevant, 
for premise K3 assumes that some true proposition is in fact unknown to us; 
but at the said future time t it may very well be the case that all facts are
known to us. Rather, the SLL thesis is that not every true proposition is 
knowable now. K4 says that SLL is false; the fact that science is not now 
complete is accidental; although there are true propositions that we do not 
know, it is not logically impossible that we would have known them. 


1 Cf., e.g., the discussion of the said proof and intuitionism, in W. D. Hart, ‘Access and 
Inference’, Proceedings of the Aristotelian Society, suppl. vol. 53, pp. 53-65 [1979], and in
T. Williamson, ‘Intuitionism Disproved?’, Analysis, 42, pp. 203-7 [1982].

2 G. Schlesinger, ‘On the Limits of Science’, Analysis, 46, pp. 24-6 [1986].




Incompleteness, if true, is not necessary. That is K4, the thesis that Rescher 
attempts to refute. 

The case against K4 does look very strong. As interpreted above, K4 
claims that there is some possible world w such that every proposition that 
is true in the real world, is known in w. (The weaker claim, that each actually 
true proposition is known in some possible world or other, is useless against 
SLL; we need a world in which all actually true propositions are known.) It 
is certainly not true that for every sentential operator F there is a possible 
world w such that, if it is true that p in the real world, ‘p&Fp’ is true in w. 
Take, e.g., ‘In Rome, p’; no possible world w contains all the facts of the real 
world, and the fact that they all occur in Rome. Or take ‘p, for no reason’; 
there is no possible world w where all actual facts obtain, and yet they all 
have no reason. Rescher’s argument is, then, that knowledge is not special: 
‘We know that p’ is like these other sentential operators, and there is no 
possible world w where we know that p, iff it is true that p in the real world. 
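
The difference between the strong claim and the weaker claim distinguished above is a difference in quantifier order. It can be displayed as follows, where the predicates $\mathrm{True}_{@}$ (true in the real world) and $\mathrm{Known}_{w}$ (known in world w) are labels introduced here purely for illustration and do not appear in the proof.

```latex
\begin{align}
  \text{strong:}\quad & \exists w\, \forall p\,
      \bigl(\mathrm{True}_{@}(p) \rightarrow \mathrm{Known}_{w}(p)\bigr) \\
  \text{weak:}\quad   & \forall p\,
      \bigl(\mathrm{True}_{@}(p) \rightarrow \exists w\, \mathrm{Known}_{w}(p)\bigr)
\end{align}
```

The strong claim entails the weak one but not conversely, which is why the weak claim alone is of no use against SLL.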

In the present article I want to defend K4, arguing that it is logically
possible to know everything. It is going to be quite a detective story: we have 
a foul contradiction, and my client, K4, is the prime suspect. I have to show 
that K4 did not do it, and then who did do it, and how. So let me try to get 
K4 off the hook by giving it an alibi: remove it, and see whether I can still 
get a contradiction from the first three premises only. That attempt will fail, 
but its failure is highly instructive. Here is my attempt. By Existential 
Instantiation, derive from K3 


10. q & -Kq. 


Since we have validly derived (10) from an allegedly true premise, we know
it to be true, i.e.,


11. K(q & -Kq)


Apply K2, and use K1 on the second conjunct; we get


12. Kq & -Kq


that is, a flat contradiction. If that proof works the source of trouble must be 
somewhere else, K4 is innocent, and a perfect science may be possible after 
all. But the proof does not work. Rescher was careful not to instantiate in 
K3, but I was not. It is obvious, then, that such an instantiation is not 
allowed. But why not? EI is a good rule of logic; what makes this context 
unreceptive to it? 

‘K’ is an epistemic operator; epistemic operators sustain two different 
interpretations, de dicto and de re. If ‘K’ is read de dicto it is a sentential 
operator, the letter ‘p’ is a schematic letter substitutable for sentences, and 
what the assumed knower S is said to know is the content of the proposition 
expressed by the sentence that replaces ‘p’. On the other hand if ‘K’ is read
de re it is a predicate, the letter ‘p’ is a variable ranging over propositions,
and what S is said to know is that the proposition which is the value of ‘p’ is
true (he need not know its content). So far I, like all others who dealt with 
this proof, read it de dicto; that is an easier and more intuitive reading. But 
is it correct? There are indications that the de dicto reading is wrong. One 
is the presence of quantification in K3, (8) and (9): to quantify one needs 
variables, not schematic letters; thus ‘p’ must be a variable substitutable by 
names of propositions. Another indication is given by the abortive proof (10)- 
(12) above. If ‘p’ in K3 is replaced by a sentence, as soon as we put that 
sentence down we know what it says, and ‘-Kp’ is false. The only way to 
make sense of K3 is to take ‘p’ as replaceable by names of propositions, not 
by sentences that express them, i.e., by reading it de re.

Let me elaborate that point. Suppose that Jones has forgotten what the 
Pythagorean Theorem is all about, but still knows that it is true. Is ‘Kp’ 
true of Jones? Read de dicto it is not, for Jones does not know the content 
of the proposition p,¹ expressed by the sentence (‘the sum of the squares ...’
etc.) that is substituted for the letter ‘p’ in ‘Kp’. But read de re ‘Kp’ is true, 
for Jones does know, of the proposition p to which he can refer by name 
(‘the Pythagorean Theorem’), that it is true. 

Suppose that Jones does not know that p, and thus ‘-Kp’ (read de dicto) 
is true of him. Now ‘-Kp’ expresses a proposition; can Jones himself know 
it, i.e., know that he does not know that p? We do attribute such knowledge 
to people, so ‘K-Kp’ is unproblematic; but let us see what are its truth 
conditions. If ‘-Kp’ is read de dicto, the ‘p’ is replaced by the Pythagorean 
Theorem itself, not by the words ‘The Pythagorean Theorem’. That makes 
nonsense out of ‘K-Kp’, for if Jones knows the content of the theorem p 
he does know that p after all, and ‘K-Kp’ (read de dicto) is false. That 
is why the truth conditions of ‘Jones knows that he does not know the 
Pythagorean Theorem’ do not require that Jones should know the content 
of the theorem; he should only refer to the theorem (say, by the name ‘The
Pythagorean Theorem’), and justifiedly believe of it, that he does not know 
what its content is. Thus, the only reading that makes sense of ‘K-Kp’ is 
de re. 

I conclude that although ‘-Kp’ may be read de dicto, if it is embedded 
in another de dicto epistemic operator relative to the same person it must 
be read de re. ‘K_S-K_J p’ (i.e., ‘Smith knows that Jones does not know
that the car is stolen’) is true iff Smith knows the content of p, and that
Jones does not know it. No problem with a de dicto reading there. But for
‘K_J-K_J p’ to be true, it is not required that Jones knows the content of p
(that the car is stolen) and that he does not know it. The only way for Jones 
to know that he himself does not know that p is to refer to p by name, and 
believe that he does not know what is the content of the proposition that 
name stands for. Free substitution in an epistemic context is permitted only 


1 I shall use the roman ‘p’ as a schematic letter substitutable by sentences, and the italicised
‘p’ as a name of a proposition.




if the epistemic operator is read de re. Thus the corpse is found where it is 
least expected: substitution is so routine that it seems utterly unexciting, 
but here it leads us to the killer. 

Given a de re reading, one may still try to load the more natural de dicto 
meaning into the operator ‘K’. For even if ‘p’ is a variable ranging over 
propositions one may stipulate that ‘Kp’ be true iff S knows the content of 
the proposition which is the value of ‘p’. Let us see whether that can be
done. Define 


Kp = df. S knows the content of the proposition p. 
Tp = df. The proposition p is true. 


Now let us reproduce the proof, if we can. 


(K1*) Kp → Tp

(K2*) K(p&q) → (Kp & Kq)

13. K(-Kp) → T(-Kp)    from K1*, p/-Kp

14. -[K(-Kp) & -T(-Kp)]    from 13

15. -[K(-Kp) & T(Kp)]    from 14

16. K(-Kp & Tp) → K(-Kp) & K(Tp)    from K2*, p/-Kp, q/Tp.


But that is all. We cannot go any further, for the next step,


17. -K(-Kp & Tp)    from 15, 16 by Modus Tollens,


does not follow. Unlike Rescher’s proof, ours is stuck at this point, for
‘T(Kp)’ is not equivalent to ‘K(Tp)’. ‘T(Kp)’ says: the proposition, ‘S knows
the content of the proposition p’, is true. ‘K(Tp)’ says: S knows the content of
the proposition, ‘the proposition p is true’. The two are mutually independent:
‘K(Tp)’ does not imply (as ‘T(Kp)’ does) that S knows the content of p; it
only says that S knows the content of ‘p is true’, and as we saw one may know
the content of the proposition ‘the Pythagorean Theorem is true’ without
knowing the content of the Pythagorean Theorem. On the other hand,
‘T(Kp)’ does not imply (as ‘K(Tp)’ does) that S knows that the theorem he
knows is called ‘p’; one may know the content of the Pythagorean Theorem
(that the sum of the two squares . . . etc.) without knowing that it is called
‘The Pythagorean Theorem’. There is no way, therefore, to load ‘Kp’ in
such a way that it can be given a de dicto meaning. The only way to read it 
on which the proof is still coherent and valid is de re. 

That the proof has to be read de re helps us find the culprit, and solves 
the Whodunit riddle. I shall now name the knave: it is the innocent looking 
K2. Read de re, ‘K(p&q)’ says: S knows, of the proposition p&q, that it is 
true. S does not have to know the content of that proposition; he should 
only refer to it by some name (say ‘Ruth’) and justifiedly believe (de dicto) 
that it is true. That S may do without believing either that p is true, or that 
q is true; he may have never heard of these theorems. How can S know that 




Ruth is true, if he does not know that p is true and that q is true? Perhaps 
someone told him. If Goedel tells me that the theorem which he calls ‘Ruth’ 
is true, then I know, of Ruth, that it is true, although I do not know that 
Ruth has the logical form, ‘p&q’. Goedel knows the content of the theorem 
Ruth, i.e., he knows that p&q; therefore, he knows that p and he knows
that q. But I do not; I know nothing of p and of q, since although these 
theorems are logically implied by a theorem which I believe is true, I am 
ignorant of their very existence, and hence I can have no beliefs about them. 
One cannot know the content of Proust’s Remembrance of Things Past 
without knowing the content of the novels Swann’s Way, Within a Budding
Grove, etc., of which it consists. But one may perfectly well know the
content of the proposition, ‘Remembrance of Things Past is a true story’,
without knowing, or even believing, the content of the proposition ‘Swann’s
Way is a true story’, although the latter is entailed by the former. Thus K2
read de re is false.¹

Let me quickly take care of two objections. One may hold the following 
view on de re belief: that if b is part of a, and you believe something of a, 
then you also believe something of b. Under that odd view K2 read de re is 
true. But under that view we know, de re, everything. Here is how: call the 
conjunction of all true propositions ‘Tom’; you believe that Tom is true; 
therefore by the said view you believe, of every proposition, that it is true. 
Your belief is justified, and hence you know every proposition (de re). In 
that case, however, (9) is true, K3 is false, the paradoxality of the proof 
disappears, and K4 is left intact. 

The other objection is that being but finite humans, we can never refer 
to all propositions, hence we cannot have a belief (de re) of all of them, and 
K4 is false. That confusion about ‘all’ has been already exposed by the 
medieval logician John Buridan. Of course we can refer to any proposition; 
there is no proposition such that we cannot make up a name for it. (Every 
chair is such that you can sit on it, even though there are too many chairs 
for you to sit on all of them.) Thus once more the proposition that we can 
know every proposition is unscathed. 

Since K4 was cleared of all charges, I conclude that we can know every- 
thing, including propositions about what we do not know. That is possible 
because we only need to know these propositions’ names rather than their 
content. Rescher’s proof is therefore valid, but its conclusion is false, 
because under the only reading that it can sustain, a de re reading, its most 
(seemingly) harmless premise is unacceptable. 


The Hebrew University of Jerusalem 


1 For more on that point see my ‘Transparent Belief’, Australasian Journal of Philosophy, 60, 
pp. 55-65 [1982], ‘Speaking of Belief’, ibid., 63, pp. 78-83, ‘De Re and Descartes: a New 
Semantics for Indexicals’, Nous, 19, pp. 181-204 [1985], and ‘Out of the Belief Trap’, 
Erkenntnis, forthcoming. 


Brit. J. Phil. Sci. 38 (1987), 533-549 Printed in Great Britain 533 


A Verisimilar Ordering of Theories 
Phrased in a Propositional Language 
by CHRIS BRINK and JOHANNES HEIDEMA 


1 Introduction 
2 Power Relations 
3 Power Ordering of Propositional Formulae 
4 Theories Phrased in a Propositional Language 
5 Power Ordering as a Verisimilar Ordering 


1 INTRODUCTION 


Popper introduced the concept of verisimilitude in the early sixties; Miller 
and Tichy deflated his definition in the early seventies. These facts, and 
the ensuing debate on verisimilitude, have been well chronicled in the pages 
of this Journal, as a look at, for example, Oddie [1981] and Urbach [1983] 
will show. Various attempts have been made to construct an acceptable 
definition of verisimilitude; these have mostly centred around the idea of 
distance from the truth. But this shared approach has not led to any sub- 
stantial agreement. Moreover, of late there has been growing pessimism 
concerning the very possibility of success. This is due mostly to an argument 
first raised in Miller [1974], which says that our intuitions concerning 
verisimilitude are dependent on the language in which theories are 
expressed. 

In this paper we do not tread any of the well-worn paths: this is not 
another de novo investigation of the issues involved in verisimilitude. It is 
rather the serendipitous application to this concept of a certain construction 
applied to algebraic structures. Namely, to any structure there corresponds 
its power structure, essentially built up by taking the power relation of each 
relation in that structure. And for any relation R between elements of a set 
A, its power relation R* relates subsets of A in a way dependent on R. It 
turns out that there is a natural power relation to be found between formulae 
of a propositional language. We offer this relation for consideration as a 
verisimilar ordering of theories phrased in a propositional language. Our 
approach, then, if we are to be charged with one, is to think of verisimilitude 
as an ordering relation between theories, and to present a model of this 
relation. We agree that an acceptable definition of verisimilitude should 
lead to an ordering of theories phrased in a first-order language. 
Nevertheless, for a start, to illustrate the ideas involved, and because the 
subject has its own intrinsic interest, we limit ourselves here to the finite 
propositional case. 

Received October 1986 

In Section 2 we introduce power relations, and specialise forthwith to 
power orderings. In Section 3 we bring to light the power ordering of 
propositional formulae; in Section 4 ditto for theories phrased in a pro- 
positional language. In Section 5 we advance some reasons for adopting 
power ordering as a verisimilar ordering. One good reason is that it allows 
a formulation of some general principles concerning verisimilitude. 
Another is the rehabilitation, in a limited context, of Popper’s original 
conception of verisimilitude. A third is that, on our construction, Miller’s 
puzzle does not arise. 


2 POWER RELATIONS 


Corresponding to any set A is its power set P(A), having as elements all 
subsets of A. Given any binary relation R defined over a set A, Chris Brink 
has defined ([1986]) its power relation R* over P(A) in the following way: 
for any elements X and Y of P(A) (that is, any subsets of A), X is 
R*-related to Y iff every element of X is R-related to some element of Y, 
and also to every element of Y some element of X is R-related. Formally: 


X R* Y  iff  (∀x∈X)(∃y∈Y)[xRy] & (∀y∈Y)(∃x∈X)[xRy] 


To grasp the idea behind this definition, consider the case where R is 
some kind of ordering ≤, say for example the partial order pictured in 
Figure 1. Here A = {a, b, c, d, e, f, g, h}; an arrow from x to y indicates that 
x ≤ y, and for clarity identity and transitivity arrows are suppressed. Then 
the power ordering ≤* orders subsets of A in a way which makes precise 
the intuitive idea of one subset being ‘above’ or ‘below’ another (possibly 
overlapping) one with respect to the given ordering ≤. Thus, for example, 
X = {b,c,e} is ‘below’ Y = {c,f,g} in the sense that for every element of 
X there is some element of Y which is greater than or equal to it, and for 
every element of Y there is some element of X which is smaller than or 
equal to it. So X ≤* Y, and similarly (e.g.) Y ≤* Z = {h}, and 
A ≤* Q = {a,b,d,h}. Thus the power ordering of subsets of A takes place 
with respect to the given ordering of elements of A. By contrast, the 
conventional set-theoretical ordering of inclusion (⊆) completely ignores the 
existing ordering (if there is one under consideration). The concept of 
power relation, therefore, leads in the case of ordered sets to a novel ordering 
of subsets. 
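The definition of the power relation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name power_related and the divisibility ordering used as the example are illustrative choices, not taken from the paper, and the ordering of Figure 1 is not reproduced.

```python
def power_related(X, Y, R):
    """X R* Y: every x in X is R-related to some y in Y, and
    every y in Y has some x in X that is R-related to it."""
    every_x_reaches_some_y = all(any(R(x, y) for y in Y) for x in X)
    every_y_reached_by_some_x = all(any(R(x, y) for x in X) for y in Y)
    return every_x_reaches_some_y and every_y_reached_by_some_x


def divides(x, y):
    """Hypothetical example ordering (an assumption, not Figure 1):
    x <= y iff x divides y, a partial order on positive integers."""
    return y % x == 0


X, Y, Z = {2, 3}, {6, 12}, {5}

x_below_y = power_related(X, Y, divides)  # 2|6 and 3|6; 6 and 12 each have a divisor in X
y_below_x = power_related(Y, X, divides)  # 6 divides neither 2 nor 3
x_below_z = power_related(X, Z, divides)  # 2 does not divide 5
```

Neither of the sets X and Y in this sketch includes the other, so inclusion says nothing about them, whereas the power relation places X below Y with respect to the chosen ordering of elements.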

Without attempting to be systematic or exhaustive we collect here some 
facts concerning power relations relevant to the purpose of this paper. First, 
the introduction of power relations has a motivation independent of the 
verisimilitude enterprise. Namely, it is an essential ingredient in the concept 
of power structure, which is a generalisation of the known universal-algebraic 
concept of power algebra. Details and references can be found in Brink 
[1986]. Second, any relation R is embedded in its power relation R*. That 
is, there is a subrelation S of R* which is isomorphic to R. This is in fact 
easy to see: note that for any elements a and b of A the singleton sets {a} 
and {b} are related by R* iff aRb, so clearly the mapping f: A → P(A) given 
by f(x) = {x} is the required isomorphism. In a sense, then, a power relation 
extends a given relation from elements to subsets. Third, some properties 
of relations are preserved under the ascent to power relations, and some 
are not. For example, reflexivity and transitivity are preserved, but 
antisymmetry is not (e.g., in Figure 1 A ≤* Q and Q ≤* A, nevertheless 
A ≠ Q). This means that the power order of a partial order is in general 
not again a partial order. Fourth, the power order of a partial order is always 
a quasi-order. By definition (Rasiowa [1974], for example), any reflexive and 
transitive relation is a quasi-order. Fifth, also worth noting is the absence 
of connectedness: it is not the case that any two subsets of A must be in 
some way R*-related. In Figure 1 for instance, Y ≰* Q and also Q ≰* Y. 
Non-connectedness holds in particular for the empty set ∅. It is not related 
to any subset of A other than itself because of the existential quantifiers in 
the definition of R*. Sixth, if we start with an unordered set A, its power 
set P(A) is partially ordered by inclusion ⊆, hence the power relation ⊆* 
of inclusion quasi-orders P(P(A)). So the concept of power ordering is 
always applicable to sets of subsets of a given set. In this paper we make 
use of precisely such a 3-level system: unordered set A, P(A) endowed 
with ⊆, and P(P(A)) endowed with ⊆*. 

[Figure 1: the partial order on A = {a, b, c, d, e, f, g, h}, with the subsets 
X = {b,c,e}, Y = {c,f,g}, Z = {h} and Q = {a,b,d,h} marked.] 

Since power relations appear in this paper mainly in the guise of power 
ordering we introduce for convenience the following notation: the power 
ordering ≤* of an ordering ≤ will be indicated by the symbol ≼. Thus: 


2.1. Definition. If A is a set ordered by ≤ then the power ordering ≼ of 
≤ is that relation between subsets of A such that for any subsets X 
and Y of A 


X ≼ Y  iff  (∀x∈X)(∃y∈Y)[x ≤ y] & (∀y∈Y)(∃x∈X)[x ≤ y]. □ 


[Figure 2: the partial ordering ≤ of the four-element Boolean algebra 
B4 = {0, a, b, 1}.] 


The convenience of this notation lies in the fact that ≼ is the intersection 
of two weak power orderings, given by ≼↑ and ≼↓, where 


X ≼↑ Y  iff  (∀x∈X)(∃y∈Y)[x ≤ y] 
(and we read ‘Y reaches higher than X’), and 
X ≼↓ Y  iff  (∀y∈Y)(∃x∈X)[x ≤ y] 


(and we read ‘X reaches lower than Y’). For example, and to exemplify again 
the interplay between inclusion and power ordering, if we denote for the 
moment by ≼ the power relation =* of the identity relation = over a set 
A, then ≼↑ is inclusion ⊆ and ≼↓ is inclusion ⊇ between subsets of A, so 
that ≼ is in fact identity on P(A). So inclusion is itself a weak power 
ordering, and the conventional ordering of subsets thus appears as an 
instance of the unconventional ordering introduced here. Note further that 
the notation lends itself to easily interpretable variations. Thus, 


X ≽↑ Y means Y ≼↑ X 

X ≺↑ Y means X ≼↑ Y and not X ≽↑ Y 
X ≽↓ Y means Y ≼↓ X 

X ≺↓ Y means X ≼↓ Y and not X ≽↓ Y 
etc. 
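The remark that inclusion is itself a weak power ordering can be checked by brute force on a small set. The sketch below is an illustration only: the function names reaches_higher, reaches_lower and power_le are labels introduced here for the two weak orderings (‘Y reaches higher than X’, ‘X reaches lower than Y’) and their intersection, and the three-element base set is an arbitrary assumption.

```python
from itertools import combinations


def reaches_higher(X, Y, le):
    """Weak ordering read 'Y reaches higher than X'."""
    return all(any(le(x, y) for y in Y) for x in X)


def reaches_lower(X, Y, le):
    """Weak ordering read 'X reaches lower than Y'."""
    return all(any(le(x, y) for x in X) for y in Y)


def power_le(X, Y, le):
    """Power ordering: the intersection of the two weak orderings."""
    return reaches_higher(X, Y, le) and reaches_lower(X, Y, le)


def identity(x, y):
    """The identity relation on the base set."""
    return x == y


A = (1, 2, 3)  # hypothetical base set
SUBSETS = [frozenset(c) for k in range(len(A) + 1) for c in combinations(A, k)]
PAIRS = [(X, Y) for X in SUBSETS for Y in SUBSETS]

# Over the identity relation the weak orderings collapse to inclusion
# and its converse, and the power ordering collapses to identity on P(A).
higher_is_subset = all(reaches_higher(X, Y, identity) == (X <= Y) for X, Y in PAIRS)
lower_is_superset = all(reaches_lower(X, Y, identity) == (X >= Y) for X, Y in PAIRS)
power_is_identity = all(power_le(X, Y, identity) == (X == Y) for X, Y in PAIRS)
```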


We conclude this section with an example which we will refer back to. 
Let B4 = {0, a, b, 1} be the (base set of) the four-element Boolean algebra; 
it is then partially ordered by ≤ as indicated in Figure 2. Then P(B4) 
under the conventional ordering of inclusion ⊆ has the ordering shown in 
Figure 3, but P(B4) under the power ordering ≼ has the ordering shown 
in Figure 4. The two orderings are entirely different. Figure 4 serves to 
illustrate again some of the facts listed above. That the ordered structure 
(B4; ≤) is embedded in (P(B4); ≼) is clear by looking at the structure 
exhibited by the singleton sets {0}, {a}, {b} and {1} in Figure 4. That 
antisymmetry fails is indicated by the double-headed horizontal arrows. And 
the empty set, as always, is ≼-related only to itself. 
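The facts just listed for B4 can be verified mechanically. The following is a minimal sketch, assuming the usual ordering of the four-element Boolean algebra (0 below a and b, both below 1); the string encoding of the elements and all names are illustrative choices.

```python
from itertools import combinations

ELEMENTS = ("0", "a", "b", "1")
# Assumed ordering of B4: 0 <= a, b <= 1, with a and b incomparable.
LE = {(x, x) for x in ELEMENTS} | {
    ("0", "a"), ("0", "b"), ("0", "1"), ("a", "1"), ("b", "1")
}


def le(x, y):
    return (x, y) in LE


def power_le(X, Y):
    """Power ordering of subsets of B4 induced by le."""
    return (all(any(le(x, y) for y in Y) for x in X)
            and all(any(le(x, y) for x in X) for y in Y))


SUBSETS = [frozenset(c) for k in range(len(ELEMENTS) + 1)
           for c in combinations(ELEMENTS, k)]
EMPTY = frozenset()

# Embedding: singletons are ordered exactly as their elements are.
embedding_ok = all(power_le(frozenset([x]), frozenset([y])) == le(x, y)
                   for x in ELEMENTS for y in ELEMENTS)

# Quasi-order: reflexive and transitive.
reflexive = all(power_le(X, X) for X in SUBSETS)
transitive = all(power_le(X, Z)
                 for X in SUBSETS for Y in SUBSETS for Z in SUBSETS
                 if power_le(X, Y) and power_le(Y, Z))

# Antisymmetry fails: distinct subsets related in both directions.
two_way_pairs = [(X, Y) for X in SUBSETS for Y in SUBSETS
                 if X != Y and power_le(X, Y) and power_le(Y, X)]

# The empty set is related only to itself.
empty_isolated = all((power_le(EMPTY, X) or power_le(X, EMPTY)) == (X == EMPTY)
                     for X in SUBSETS)
```

For instance, {0, 1} and {0, a, 1} are distinct yet each lies below the other, which is one of the two-way relationships collected in two_way_pairs.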


[Figure 3: P(B4) ordered by inclusion ⊆.] 


3 POWER ORDERING OF PROPOSITIONAL FORMULAE 


In this section we apply the idea of power ordering to n-variable 
propositional formulae by viewing them as sets of subsets of a given set. 

Let {p1, p2, ..., pn} be a set of propositional variables. Propositional 
formulae 𝒜, ℬ, 𝒞, ... are built up out of these variables and the connectives 
~ (not), & (and), v (or), ⇒ (if ... then ...) and ⇔ (iff) in the conventional 
way. Up to truthfunctional equivalence there are 2^(2ⁿ) such formulae, each 
of which may be expressed in disjunctive normal form (DNF). Each DNF 
is a disjunction of some k primitive conjunctions (±)p1 & (±)p2 & ... & 
(±)pn, where k = 0, 1, ..., 2ⁿ and the symbol (±) before a propositional 
variable indicates that there may or may not appear a negation sign in 
that position. Following conventional terminology we call these primitive 
conjunctions constituents. Each constituent corresponds in one-one fashion 
to a subset of {p1, p2, ..., pn}, namely the set made up of all and only those 
variables appearing without negation signs in the constituent. Each DNF 
then corresponds in one-one fashion to a set of sets, namely the set of all 
subsets of {p1, p2, ..., pn} corresponding to the constituents appearing in it. 
Since there are 2ⁿ subsets of {p1, p2, ..., pn} there are 2^(2ⁿ) sets of these 
sets, one for each DNF. 


The correspondence between propositional formulae and sets of subsets 
of {p1, p2, ..., pn} can also be explained in terms of truth tables. We need 
only consider truth tables of 2ⁿ rows, since any formula of m < n variables 
may be represented up to truthfunctional equivalence by some n-variable 
formula. Each row in the truth table of such a formula 𝒜 is generated by 
precisely one assignment of truth values to the variables in 𝒜. Each 
such assignment may be viewed as a characteristic function defined 
over {p1, p2, ..., pn}, and each such characteristic function determines in 
one-one fashion a subset of {p1, p2, ..., pn}. Now select all and only those 
subsets corresponding to a row with True in the main column, then we 
have a set of subsets of {p1, p2, ..., pn} which represents 𝒜. 

[Figure 4: P(B4) ordered by the power ordering ≼.] 

This correspondence between formulae (for the rest of this section we 
drop ‘n-variable’ and/or ‘propositional’) and sets of sets is in fact an 
isomorphism. The set corresponding to ~𝒜 is the complement with respect 
to P({p1, p2, ..., pn}) of the set corresponding to 𝒜; the set corresponding 
to 𝒜 & ℬ is the intersection of the sets corresponding to 𝒜 and to ℬ, and 
the set corresponding to 𝒜 v ℬ is their union. This being so we are justified 
in identifying a formula with the set representing it, and speaking (e.g.) of 
the elements of a formula. 
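The truth-table correspondence and its behaviour under the connectives can be illustrated as follows. This is a sketch under stated assumptions: three variables are used for concreteness, a truth-table row is encoded as the set of variables it makes true, and the names represent, f and g are hypothetical.

```python
from itertools import combinations

VARIABLES = ("p1", "p2", "p3")  # assumed: n = 3 for illustration

# Each truth-table row is identified with the subset of variables it makes true.
ROWS = [frozenset(c) for k in range(len(VARIABLES) + 1)
        for c in combinations(VARIABLES, k)]


def represent(formula):
    """Set of rows (subsets of VARIABLES) on which the formula is True."""
    return frozenset(row for row in ROWS if formula(row))


def f(row):
    """Example formula: p1 v p2."""
    return "p1" in row or "p2" in row


def g(row):
    """Example formula: p3 & ~p1."""
    return "p3" in row and "p1" not in row


# Negation is complement, conjunction is intersection, disjunction is union.
negation_is_complement = (represent(lambda r: not f(r))
                          == frozenset(ROWS) - represent(f))
conjunction_is_intersection = (represent(lambda r: f(r) and g(r))
                               == represent(f) & represent(g))
disjunction_is_union = (represent(lambda r: f(r) or g(r))
                        == represent(f) | represent(g))
```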

We now arrange formulae in the 3-level system foreshadowed in Section 
2. At Level I there is the unordered set {p1, p2, ..., pn} of atomic formulae. 
At Level II there is P({p1, p2, ..., pn}), with subsets of {p1, p2, ..., pn} as 
elements. These are (i.e., correspond to) the constituents; they are ordered 
by inclusion. At Level III there is P(P({p1, p2, ..., pn})), consisting of sets 
of subsets of {p1, p2, ..., pn}, and these are the n-variable formulae. The 
hierarchy is cumulative: all the formulae appearing at Level I and Level II 
also appear at Level III. That is, their representatives do. Namely, all the 
elements appearing at Level I are represented at Level II by their singleton 
sets, and in the same way Level II elements are represented at Level III. 
Moreover, the ordering at Level II has no effect on Level I representatives: 
they remain unordered. And the ordering at Level III orders the Level II 
representatives in the same way as their originals are ordered. 

Since Level II is ordered by inclusion ⊆ we may consider Level III as 
being ordered by its power order ≼. That is: 


𝒜 ≼ ℬ iff for every element of 𝒜 there is some element of ℬ which 
includes it, and for every element of ℬ there is some element 
of 𝒜 included in it. 


We conclude this section with an example: the case where n = 2. There 
are 16 truthfunctionally distinct formulae f(p, q) in two variables p and q. 
Each has a truth table of four rows, each of which corresponds to a subset 
of {p,q}. Thus: 


Set       p   q   f(p, q) 
{p, q}    T   T   T or F 
{p}       T   F   T or F 
{q}       F   T   T or F 
∅         F   F   T or F 

Any two-variable formula is represented by the set of sets corresponding 
to those rows with T in the main column. Thus, for example, 


p & q   corresponds to   {{p, q}} 
p v q   corresponds to   {{p, q}, {p}, {q}} 
~p      corresponds to   {{q}, ∅} 


etc. 
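The two-variable correspondences above can be reproduced directly from the truth table. The sketch below is illustrative only; the helper represent and the encoding of rows as frozensets are assumptions made here.

```python
from itertools import product

# The four truth-table rows, each identified with a subset of {p, q}.
ROWS = [frozenset({"p", "q"}), frozenset({"p"}), frozenset({"q"}), frozenset()]


def represent(formula):
    """Set of subsets of {p, q} on which formula(p, q) is True."""
    return frozenset(row for row in ROWS if formula("p" in row, "q" in row))


p_and_q = represent(lambda p, q: p and q)
p_or_q = represent(lambda p, q: p or q)
not_p = represent(lambda p, q: not p)

# Every choice of a main column (4 truth values) gives a distinct set of sets.
all_formulae = {
    frozenset(row for row, value in zip(ROWS, column) if value)
    for column in product([False, True], repeat=len(ROWS))
}
```

The set all_formulae has sixteen members, one for each truthfunctionally distinct two-variable formula.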


The four subsets of {p, q} exhibit under inclusion the ordering of Figure 
5, which is isomorphic to the ordering of B4 pictured in Figure 2. The 
power order of P(P({p, q})) is therefore isomorphic to the power order of 
B4 pictured in Figure 4. For convenience we reproduce it in Figure 6 with 
the new annotations. To the reader we leave the easy but instructive task 
of writing each formula in DNF. 

[Figure 6: the power ordering of Figure 4, annotated with the sixteen 
two-variable formulae.] 


4 THEORIES PHRASED IN A PROPOSITIONAL LANGUAGE 


We have now brought to light an unconventional but natural ordering of 
propositional formulae. In what sense is this a verisimilar ordering of 
theories phrased in a propositional language? Come to think of it, what is 
a theory phrased in a propositional language? We explain as follows. 

To begin with we postulate the existence of a finite number of atomic 
facts F1, F2, ..., Fn. We attempt no examination of the nature of atomic 
facts; instead we put some constraints on our use of this concept. First, 
when we speak of an atomic fact we mean that it ‘is the case’, or ‘obtains’. 
Second, each atomic fact is asserted by some (true) atomic sentence, which 
says that the fact obtains. We indicate these atomic sentences by 
P1, P2, ..., Pn respectively. Third, the usual connectives may be applied 
iteratively to these atomic sentences. Thus if Pi and Pj assert atomic facts 
Fi and Fj, respectively, then ~Pi asserts (falsely) that Fi does not obtain, 
Pi & Pj asserts (truly) that both Fi and Fj obtain, and Pi v Pj asserts (truly) 
that at least one of Fi or Fj obtains. Fourth, atomic facts are logically 
independent: there is no sense in which the assertion that some of them do 
or do not obtain entails the assertion that another different one obtains or 
does not obtain. 

We think of a world as being made up entirely of atomic facts. Different 
sets of atomic facts determine different worlds. For the present discussion 
the real world is made up of the atomic facts F1, F2, ..., Fn. Iterated 
application of the connectives to the atomic sentences P1, P2, ..., Pn asserting 
them results in compound sentences A, B, C, .... The build-up here is 
entirely analogous to that of propositional formulae 𝒜, ℬ, 𝒞, ... from 
propositional variables p1, p2, ..., pn. We exploit this by referring for any 
sentence A to its propositional form, meaning the analogous propositional 
formula 𝒜. (Technically: we are exploiting an isomorphism between two word 
algebras.) We think of any sentence A as a theory concerning the real world. 
These theories are phrased in a propositional language, meaning that each 
has a propositional form. The propositional form of a theory is unique up 
to truthfunctional equivalence. There are 2^(2ⁿ) such theories and, being 
sentences, each is either true or false. 

Theories, like propositional formulae, appear at the third of three levels. 
At Level I there is the set of atomic sentences asserting the atomic facts 
comprising the real world. At Level II there are what we will call diagrams 
(we borrow the term from model theory). A diagram is a conjunction 
(±)P1 & (±)P2 & ... & (±)Pn, with (±) again indicating the presence or 
absence of a negation sign. Diagrams, therefore, correspond to 
constituents, but keep in mind that diagrams are sentences and constituents are 
formulae. We suppose that to each diagram there corresponds in one-one 
fashion a possible world: a world which, were it to exist, would be such that 
the diagram would be true. The real world, too, is a possible world, and in 
the context of this world the real world diagram P1 & P2 & ... & Pn is the 
only true diagram. At Level III there are the theories, represented up to 
truthfunctional equivalence by DNF’s. To each theory there corresponds a 
set of possible worlds, namely those corresponding one-one to the diagrams 
appearing as disjuncts in its DNF. These possible worlds are precisely the 
ones in which the theory would be true, were they to exist. Each theory is 
in fact, that is to say with respect to the real world, either true or false: true 
if it has the real world diagram as a disjunct in its DNF and false otherwise. 
Since theories are sentences we may speak of the conjunction and 
disjunction of theories. Then, for any theories A and B, A & B would be true 
in precisely those possible worlds in which both A and B would be true, 
and A v B would be true in precisely those possible worlds in which at 
least one is true. 
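The picture of theories as sets of possible worlds lends itself to a small executable sketch. The example assumes a world of three atomic facts; the names theory and is_true, and the encoding of a possible world as the set of atomic sentences its diagram leaves unnegated, are illustrative choices rather than anything fixed by the text.

```python
from itertools import combinations

ATOMS = ("P1", "P2", "P3")      # assumed: three atomic sentences
REAL_WORLD = frozenset(ATOMS)   # the real world diagram P1 & P2 & P3

# One possible world per diagram: the set of atoms appearing unnegated.
POSSIBLE_WORLDS = [frozenset(c) for k in range(len(ATOMS) + 1)
                   for c in combinations(ATOMS, k)]


def theory(sentence):
    """A theory, identified with the set of possible worlds in which it would be true."""
    return frozenset(w for w in POSSIBLE_WORLDS if sentence(w))


def is_true(t):
    """A theory is true iff the real world diagram is among its disjuncts."""
    return REAL_WORLD in t


A = theory(lambda w: "P1" in w)       # the theory P1
B = theory(lambda w: "P2" not in w)   # the theory ~P2

conjunction = A & B   # true in exactly the worlds where both would be true
disjunction = A | B   # true in exactly the worlds where at least one would be true
```

Here A is a true theory and B a false one; their conjunction is false and their disjunction true, as the real world diagram belongs to A and to A v B only.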

We now order the theories under consideration. This is simple: for any 
theories A and B we put 


A ≼ B  iff  𝒜 ≼ ℬ, 


where 𝒜 and ℬ are, respectively, the propositional forms of A and B, and 
≼ on the right is the power ordering of Section 3. It is equally simple to 
define between theories the analogues of ≼↑, ≼↓, ≺, etc., and we shall 
consider this done. 

Note that diagrams are theories too—the only theories appearing on 
Level II. We may call them epistemologically definite theories, since for each 
of the atomic facts comprising the real world such a theory judges whether 
or not it obtains. Since they correspond to constituents these theories are 
ordered on Level II by inclusion (that is, by the relation induced by 
inclusion between constituents). So, by virtue of the embedding property 
of power relations, epistemologically definite theories appear on Level III 
ordered by ≼ as if by inclusion. 

We offer the ordering ≼ of theories phrased in a propositional language 
for consideration as a verisimilar ordering. 


5 POWER ORDERING AS A VERISIMILAR ORDERING 


In this concluding section we advance some evidence in favour of adopting 
power ordering as a verisimilar ordering. 

As a first witness we call Figure 6, which orders theories concerning a 
world built up of two atomic facts asserted by atomic sentences (say) P and 
Q. (We use the notational conventions of Section 4, and hope the reader 
will not grudge us the use of Figure 6 instead of its isomorphic counterpart 
for theories.) We find it painless to regard the ordering of Figure 6 as a 
verisimilar ordering. Thus, for example, P & Q ought to be at the top, since 
it tells the truth, the whole truth and nothing but the truth. Dually, ~P & 
~Q ought to be at the bottom, since it tells lies, all possible lies and nothing 
but lies. What about the rest? It could be argued that, for example, the 
theory P should be of lesser verisimilitude than P & Q on the grounds that 
even though it tells the truth and nothing but the truth it does not tell the 
whole truth. Moreover, it could be argued that P v Q should be of lesser 
verisimilitude than both P and Q, since although it tells the truth it remains 
uncommitted to any particular atomic fact. And so on. We are quick to admit 
that such arguments (and we regard the given ones merely as examples) can 
wear pretty thin, especially when it comes to theories such as P v Q or 
P ⇔ Q. But, and this is the point we are making here, where the arguments 
are thin so are the counterarguments. Beyond the appeal to private 
intuitions, and pending the outcome of an opinion poll regarding shared 
intuitions, we are not aware of any conclusive evidence against adopting 
power ordering as a verisimilar ordering. 

Second, we take it as a minimal (if not minimum) requirement of any 
definition of verisimilitude that it should give rise to some kind of ordering 
of theories. This is done by ≼. Thus, to borrow an example from Tichy 
[1974] and [1976], let H, R and W be the sentences ‘It is hot’, ‘It is raining’ 
and ‘It is windy’, then every one of the 256 theories concerning the world 
made up of the atomic facts asserted by these sentences corresponds to a 
set of subsets of {h, r, w} (lower case letters being propositional variables). 
For example, 


A1: ~H & ~R & ~W   corresponds to   {∅} 
A2: ~H & R & ~W    corresponds to   {{r}} 
A3: ~H & R         corresponds to   {{r, w}, {r}} 
A4: ~H & R & W     corresponds to   {{r, w}}. 


Supposing it to be in fact hot and raining and windy Tichy contends that 
as theories A1–A4 are in increasing order of verisimilitude. This is borne 
out by the power relation, which shows that A1 ≼ A2 ≼ A3 ≼ A4. In fact, 
the power relation shows much more. For example, consider the theories 
A5 (‘It is raining or windy’), A6 (‘If it is raining and windy it can’t be hot’) 
and A7 (‘At most one of the conditions hot, raining or windy obtains’). 
Then 


A5: R v W   corresponds to   {{h, r, w}, {h, r}, {h, w}, {r, w}, {r}, {w}} 
A6: (R & W) ⇒ ~H 
            corresponds to   {{h, r}, {h, w}, {h}, {r, w}, {r}, {w}, ∅} 
A7: (H & ~R & ~W) v (~H & R & ~W) v (~H & ~R & W) v (~H & ~R & ~W) 
            corresponds to   {{h}, {r}, {w}, ∅}. 


And the power relation shows that A7 ≼ A6 ≼ A5. Note, however, that 
not every two theories are comparable. For example, let A8 be the theory 
‘It is hot and dry and not windy’, then 


A8: H & ~R & ~W   corresponds to   {{h}}, 


and (e.g.) A4 ⋠ A8 and also A8 ⋠ A4. Note further that two theories A and 
B can be of equivalent verisimilitude (A ≼ B and B ≼ A) without being the 
same, or indeed having truthfunctionally equivalent propositional forms. For 
example, any two theories having propositional forms such that both h & r & w 
and ~h & ~r & ~w appear as disjuncts in their DNF’s have equivalent 
verisimilitude. All these facts are summed up by the observation that ≼ is 
a quasi-order. Adoption of ≼ as a verisimilar ordering thus has the 
advantage that we know precisely what kind of ordering we are dealing with. 

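The comparisons in the hot, raining and windy example can be recomputed from the verbal descriptions of the theories. What follows is a sketch, not a definitive implementation: each theory is built from its truth conditions (A7 is encoded as ‘at most one of the three conditions obtains’), and the helper names theory and power_le are assumptions introduced here.

```python
from itertools import combinations

ATOMS = ("h", "r", "w")
WORLDS = [frozenset(c) for k in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, k)]


def theory(sentence):
    """Set of subsets of {h, r, w} on which sentence(h, r, w) is True."""
    return frozenset(x for x in WORLDS
                     if sentence("h" in x, "r" in x, "w" in x))


def power_le(X, Y):
    """Power ordering induced by inclusion between subsets of {h, r, w}."""
    return (all(any(x <= y for y in Y) for x in X)
            and all(any(x <= y for x in X) for y in Y))


A1 = theory(lambda h, r, w: not h and not r and not w)
A2 = theory(lambda h, r, w: not h and r and not w)
A3 = theory(lambda h, r, w: not h and r)
A4 = theory(lambda h, r, w: not h and r and w)
A5 = theory(lambda h, r, w: r or w)
A6 = theory(lambda h, r, w: not (r and w) or not h)
A7 = theory(lambda h, r, w: h + r + w <= 1)   # at most one condition obtains
A8 = theory(lambda h, r, w: h and not r and not w)

first_chain = power_le(A1, A2) and power_le(A2, A3) and power_le(A3, A4)
second_chain = power_le(A7, A6) and power_le(A6, A5)
a4_a8_incomparable = not power_le(A4, A8) and not power_le(A8, A4)

# Two different theories that both contain the top and bottom diagrams.
T1 = frozenset({frozenset(ATOMS), frozenset()})
T2 = frozenset(WORLDS)
equivalent_but_distinct = T1 != T2 and power_le(T1, T2) and power_le(T2, T1)

number_of_theories = 2 ** len(WORLDS)
```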
Third, adoption of power ordering as a verisimilar ordering is instructive 
in the sense that it allows a formulation of some general principles con- 
cerning verisimilitude. To show this we prove some theorems. Under con- 
sideration now is the general n-variable case outlined in Section 4. 


5.1 Theorem For any theories A and B: 


(1) If the propositional forms of A and B are truthfunctionally 
equivalent then A ≼ B and B ≼ A. 

(2) A is true iff it has the real world diagram P1 & P2 & ... & Pn as a 
disjunct in its DNF. 

(3)(a) If A is true then B ≼↑ A. 

(b) If A is true and B is false then A ⋠↑ B (hence A ⋠ B). 
(4)(a) A & B ≽↓ A and A & B ≽↓ B. 

(b) A ≽↓ A v B and B ≽↓ A v B. 


Proof 


(1) says that truthfunctional equivalence implies verisimilar 
equivalence, and this is evident from the fact that truthfunctional 
equivalence means being represented by the same set. (Note: we have 
already pointed out that verisimilar equivalence does not in general 
imply truthfunctional equivalence.) 

(2) holds because the real world diagram is the only true diagram. 

(3) (a) holds by virtue of (2) and the definition of ≼↑. (b) holds because 
of (2): A has an element, namely the real world diagram, not included 
in any element of B. (Being false, B does not have the real world 
diagram as an element.) 

(4) holds because the DNF of A & B contains precisely those disjuncts 
common to both, and the DNF of A v B contains all disjuncts 
occurring in either. □ 
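Theorem 5.1 can be spot-checked by exhaustive enumeration in the two-variable case. The sketch below is offered under stated assumptions: it covers only n = 2, the function names are labels introduced here, and part (4) is checked in the reading on which A reaches lower than A & B and A v B reaches lower than A. A finite check of this kind illustrates the theorem; it does not replace the proof.

```python
from itertools import combinations

ATOMS = ("p", "q")
WORLDS = [frozenset(c) for k in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, k)]        # Level II: 4 diagrams
THEORIES = [frozenset(c) for k in range(len(WORLDS) + 1)
            for c in combinations(WORLDS, k)]     # Level III: 16 theories
REAL_WORLD = frozenset(ATOMS)


def reaches_higher(X, Y):
    """'Y reaches higher than X': each element of X is included in one of Y."""
    return all(any(x <= y for y in Y) for x in X)


def reaches_lower(X, Y):
    """'X reaches lower than Y': each element of Y includes one of X."""
    return all(any(x <= y for x in X) for y in Y)


def power_le(X, Y):
    return reaches_higher(X, Y) and reaches_lower(X, Y)


def is_true(T):
    return REAL_WORLD in T


# (1) Equivalent forms are the same set, so reflexivity is what is needed.
part_1 = all(power_le(A, A) for A in THEORIES)

# (3)(a) Any theory is reached higher by any true theory.
part_3a = all(reaches_higher(B, A)
              for A in THEORIES if is_true(A) for B in THEORIES)

# (3)(b) A true theory is never below a false one.
part_3b = all(not reaches_higher(A, B) and not power_le(A, B)
              for A in THEORIES if is_true(A)
              for B in THEORIES if not is_true(B))

# (4) Conjunction is intersection, disjunction is union.
part_4a = all(reaches_lower(A, A & B) and reaches_lower(B, A & B)
              for A in THEORIES for B in THEORIES)
part_4b = all(reaches_lower(A | B, A) and reaches_lower(A | B, B)
              for A in THEORIES for B in THEORIES)
```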


5.2 Theorem For any theories A and B: 


(1) If A and B are both true, then A v B ≼ A ≼ A & B. 
(2) If A is true and B is false, then 
(a) A ⋠ A & B 
(b) B and A & B may be related by ≼ or by ≽ or they may be 
incomparable. 
(c) A v B ≼ A 
(d) A v B ⋠ B. 
(3) If A and B are both false, then A and A & B may be related by ≼ 
or by ≽ or they may not be comparable. Similarly for A and A v B. 


Proof (Examples are from the 2-variable case.) 


(1) Since A & B is true, A ≼↑ A & B by (3)(a) of Theorem 5.1, and by 
(4)(a) also A ≼↓ A & B. Hence A ≼ A & B, and similarly A v B ≼ A. 
(2)(a) A is true and A & B is false, so by (2) of 5.1 A has the real world 
diagram in its DNF and A & B has not, so A ⋠ A & B. (Note: this 
means that either A & B ≼ A or they are not comparable. Examples: 
P & ~Q ≼ P, but (Q ⇒ P) & ~(P ⇒ Q) is not comparable with 
Q ⇒ P.) 
(b) B ≼ A & B when A is P and B is ~Q. A & B ≼ B when A is Q ⇒ P 
and B is P ⇒ ~Q. B and A & B are not comparable when A is Q ⇒ P 
and B is ~(P ⇔ Q). 
(c) A v B ≼↑ A by (3)(a) of 5.1, and by (4)(b) A v B ≼↓ A. 
(d) A v B is true, hence has the real world diagram in its DNF, while 
B is false and so has not. 
(3) is proved by exhibiting examples. This is left to the reader. □ 
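The universally quantified parts of Theorem 5.2 and the two-variable examples used in its proof can likewise be checked by enumeration. This is a sketch under stated assumptions: only n = 2 is covered, the helper names are hypothetical, and the last example of (2)(b) takes B to be ~(P ⇔ Q), a reading adopted here for illustration.

```python
from itertools import combinations

ATOMS = ("p", "q")
WORLDS = [frozenset(c) for k in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, k)]
THEORIES = [frozenset(c) for k in range(len(WORLDS) + 1)
            for c in combinations(WORLDS, k)]
REAL_WORLD = frozenset(ATOMS)


def power_le(X, Y):
    """Power ordering induced by inclusion between subsets of {p, q}."""
    return (all(any(x <= y for y in Y) for x in X)
            and all(any(x <= y for x in X) for y in Y))


def theory(sentence):
    return frozenset(w for w in WORLDS if sentence("p" in w, "q" in w))


def implies(a, b):
    return (not a) or b


TRUE = [T for T in THEORIES if REAL_WORLD in T]
FALSE = [T for T in THEORIES if REAL_WORLD not in T]

part_1 = all(power_le(A | B, A) and power_le(A, A & B)
             for A in TRUE for B in TRUE)
part_2a = all(not power_le(A, A & B) for A in TRUE for B in FALSE)
part_2c = all(power_le(A | B, A) for A in TRUE for B in FALSE)
part_2d = all(not power_le(A | B, B) for A in TRUE for B in FALSE)

# Examples from the proof.
P = theory(lambda p, q: p)
NOT_Q = theory(lambda p, q: not q)
Q_IMP_P = theory(lambda p, q: implies(q, p))
NOT_P_IMP_Q = theory(lambda p, q: not implies(p, q))
P_IMP_NOT_Q = theory(lambda p, q: implies(p, not q))
NOT_P_IFF_Q = theory(lambda p, q: p != q)   # assumed reading of the last example


def incomparable(X, Y):
    return not power_le(X, Y) and not power_le(Y, X)


example_2a_below = power_le(P & NOT_Q, P)
example_2a_incomparable = incomparable(Q_IMP_P & NOT_P_IMP_Q, Q_IMP_P)
example_2b_above = power_le(NOT_Q, P & NOT_Q)
example_2b_below = power_le(Q_IMP_P & P_IMP_NOT_Q, P_IMP_NOT_Q)
example_2b_incomparable = incomparable(NOT_P_IFF_Q, Q_IMP_P & NOT_P_IFF_Q)
```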


We now abstract from these results some general principles concerning 
verisimilitude of theories. Recall that a theory is true iff the real world 
occurs amongst the possible worlds in which it would be true; that the 
conjunction of two theories is true in precisely those possible worlds in 
which both would be true, and that the disjunction of two theories is true 
in precisely those worlds in which at least one would be true. When we 
speak in what follows of an ‘increase’ in verisimilitude, or of ‘greater’ 
verisimilitude, we tacitly include the possibility of equivalent verisimili- 
tude—otherwise we speak of a ‘strict increase’, or ‘strictly greater’. Simi- 
larly for decrease. 


5.3 Principles concerning verisimilitude 


(1) No false theory has greater verisimilitude than any true theory. 

(2) For true theories conjunction increases verisimilitude and disjunction 
decreases verisimilitude. 

(3) The verisimilitude of a true theory is not increased by conjoining to 
it a false theory. (It is either decreased or a non-comparable theory 
results.) 

(4) The verisimilitude of a false theory may be increased or decreased or 
neither by conjoining to it a true theory. 

(5) The verisimilitude of a true theory is decreased by disjoining to it a 
false theory. 

(6) The verisimilitude of a false theory is not decreased by disjoining to it 
a true theory. (It is either increased or a non-comparable theory results.) 

(7) For false theories verisimilitude may be increased or decreased or 
neither by conjunction as well as by disjunction. □ 


The first principle arises from (3)(b) of Theorem 5.1; the rest (in the 
given order) come from Theorem 5.2. The first principle deserves special 
mention. The first point to note concerning it is that it is a principle obeyed 
by Popper’s now discredited definition of verisimilitude. Since we will 
shortly come to Popper’s defence we count this as a point in our favour. 
The second point is that this principle has been called counterintuitive. As 
Andersson [1978] puts it: ‘We can imagine interesting false theories with a 
lot of true information which we would prefer to any almost trivially true 
theory.’ This may be true for Andersson in the overall context of scientific 
theories; it is not true for us in the present context. We stick to our first line 
of thought, and invite examples of false theories phrased in a propositional 
language which are intuitively closer to the truth than some true theory. 

The principles 5.3 appear to us intuitively acceptable. However, we do 
not present this conviction as evidence that power ordering should be 
adopted as a verisimilar ordering, since notoriously intuitions differ. What 
we do present as evidence in favour is the simple and presumably 
uncontentious fact that these principles are principles of verisimilitude. That is, 
power ordering of theories may be adopted as a verisimilar ordering because 
provable statements regarding this ordering are instances of general prin- 
ciples representing a certain point of view concerning the nature of veri- 
similitude. 

Our fourth reason for advocating the adoption of power ordering as a 
verisimilar ordering is an historical one: we claim that power ordering 
rehabilitates in a limited context Popper’s original definition of veri- 
similitude. Popper’s definition, we recall (from Popper [1972], for defi- 
niteness) was phrased in terms of truth content and falsity content, and the 
idea was that increase in verisimilitude involves simultaneous increase in 
truth content and decrease in falsity content. Some care must be taken as 
to whether ‘increase’ and ‘decrease’ are strict (i.e. irreflexive) orderings or
not, but these are matters of detail, not of substance. Popper’s own defi- 
nition makes verisimilitude a strict ordering; if we formulate for simplicity 
the analogous non-strict ordering we get that a theory A has less (or, now,
equal) verisimilitude than a theory B iff the truth content of A is included
in the truth content of B, and also the falsity content of B is included in the
falsity content of A. ‘Inclusion’ here is the usual set-theoretic ⊆.

The Popperian definition fits in precisely with the power ordering of 
epistemologically definite theories—that is, diagrams. Recall that a diagram 
is a conjunction (+)P₁ & (+)P₂ & ... & (+)Pₙ of atomic sentences or
negations of such. Each one corresponds, via the notion of propositional
form, to a subset of {p₁, p₂, ..., pₙ}, namely the subset consisting of those
variables whose corresponding sentences appear without negation signs in 
the diagram. Of course this is just a formalistic device standing in for the 
real idea, which is that each diagram corresponds to a subset of the atomic 
facts making up the real world. Call this the truth content of the diagram. 
Exactly on a par with this we could also let each diagram correspond to 
that subset of {p₁, p₂, ..., pₙ} consisting of variables whose corresponding
sentences appear with negation signs in the diagram. Call this the falsity 
content of the diagram. Then truth content and falsity content are comple- 
ments: on the formalistic treatment their intersection is empty and their 
union is {p₁, p₂, ..., pₙ}. Consequently, if the truth content of A is included
in the truth content of B, then automatically and simultaneously the falsity
content of B is included in the falsity content of A, and conversely. This
being so, we see that for diagrams the Popperian definition of verisimilitude 
gives at Level II precisely the ordering of inclusion. But this ordering is 
embedded in the <=-ordering of Level III. And so (repeating essentially 
the point made at the end of Section 4) epistemologically definite theories 
appear at Level III ordered by <= as if by the Popperian definition of 
verisimilitude. 
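
The collapse of the Popperian definition to plain inclusion can be checked mechanically for a small language. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical three-variable language (the names `p1`, `p2`, `p3` and all function names are illustrative), represents each diagram as an assignment of "unnegated" or "negated" to every variable, and verifies that the non-strict Popperian ordering coincides with inclusion of truth contents.

```python
from itertools import product

# Hypothetical three-variable language; the names are illustrative only.
VARIABLES = ("p1", "p2", "p3")


def all_diagrams():
    """Every diagram: each variable appears either unnegated (True) or negated (False)."""
    return [dict(zip(VARIABLES, bits))
            for bits in product((True, False), repeat=len(VARIABLES))]


def truth_content(diagram):
    """Variables whose sentences appear without a negation sign."""
    return frozenset(v for v in VARIABLES if diagram[v])


def falsity_content(diagram):
    """Variables whose sentences appear with a negation sign."""
    return frozenset(v for v in VARIABLES if not diagram[v])


def popper_leq(a, b):
    """Non-strict Popperian ordering: A has less or equal verisimilitude than B."""
    return (truth_content(a) <= truth_content(b)
            and falsity_content(b) <= falsity_content(a))


def collapses_to_inclusion():
    """True if popper_leq agrees with inclusion of truth contents on every pair."""
    diagrams = all_diagrams()
    return all(
        popper_leq(a, b) == (truth_content(a) <= truth_content(b))
        for a in diagrams
        for b in diagrams
    )
```

Because truth content and falsity content are complements, the second conjunct in `popper_leq` never does independent work for diagrams, which is why `collapses_to_inclusion()` holds.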

Our fifth and final reason for the adoption of power ordering is that it 
escapes Miller’s puzzle. The background is as follows: In response to the 
example of Tichy cited above, Miller [1974] presents the following
argument. Call the weather Minnesotan (indicated by m) iff it is either
simultaneously hot and wet or simultaneously cold and dry. Similarly, call 
the weather Arizonan (indicated by a) iff it is either simultaneously hot and 
windy or cold and still. Then m is truthfunctionally equivalent to h ↔ r and
a is truthfunctionally equivalent to h ↔ w. Miller notes that:

The three sentences h, m, a are logically independent of one another; and the eight 
constituents that can be formed from the three generators h, m, a are just the eight 
constituents that can be formed from h, r, w. Moreover, r is equivalent to h ↔ m,
and w is equivalent to h ↔ a. There is perfect symmetry. And there seems no
good reason, beyond sheer prejudice, for treating the h-r-w language as more
fundamental than the h-m-a one.


Miller further notes that the theories A1 and A4 are translated in the
h-m-a language as ~h & m & a (B1) and ~h & ~m & ~a (B4), respectively.
Taking A1 to be equivalent to B1, and A4 to B4, Miller criticises Tichy’s
definitions on the grounds that on Tichy’s approach A4 has greater verisimilitude
than A1, yet for the supposedly equivalent theories B1 and B4
the order is reversed. Miller concludes that Tichy’s judgements are not 
invariant under translation into equivalent languages. Subsequently this 
came to be known as the problem of ‘language dependence’ of veri- 
similitude; it features prominently in the whole debate on verisimilitude. 
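
Miller's symmetry claim and the reversal it produces can be reproduced with truth tables. The sketch below rests on several assumptions that should be read as such: the truth is taken to be hot, raining and windy; A1 and A4 are taken to be ~h & ~r & ~w and ~h & r & w, which is what the quoted translations B1 and B4 imply; and closeness to the truth is measured by a crude count of atomic agreements, which stands in for, and is not, either Tichy's definition or the power ordering.

```python
from itertools import product


def iff(x, y):
    """Truth-functional biconditional."""
    return x == y


def to_hma(state):
    """Translate an (h, r, w) valuation into (h, m, a), with m = h<->r and a = h<->w."""
    h, r, w = state
    return (h, iff(h, r), iff(h, w))


def symmetry_holds():
    """Check that r = h<->m and w = h<->a under every valuation."""
    for h, r, w in product((True, False), repeat=3):
        _, m, a = to_hma((h, r, w))
        if r != iff(h, m) or w != iff(h, a):
            return False
    return True


def agreements(state, truth):
    """Illustrative measure only: number of atoms on which state matches truth."""
    return sum(s == t for s, t in zip(state, truth))


TRUTH_HRW = (True, True, True)   # assumed truth: hot, raining, windy
A1 = (False, False, False)       # assumed: ~h & ~r & ~w
A4 = (False, True, True)         # assumed: ~h & r & w

TRUTH_HMA = to_hma(TRUTH_HRW)
B1, B4 = to_hma(A1), to_hma(A4)  # ~h & m & a and ~h & ~m & ~a
```

On this count A4 agrees with the truth on more h-r-w atoms than A1 does, while B1 agrees with the truth on more h-m-a atoms than B4 does, so any measure built on atomic agreement reverses under the translation.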

We now claim that the problem of language dependence does not arise 
for the power ordering of theories, phrased in a propositional language. On 
our approach, too, A1<=A4 and B4<=B1 upon working with the corresponding
sets. Yet we are not discomfited by Miller’s (potential) charge
that our judgements concerning verisimilitude are not invariant under 
translation into equivalent languages. 

Our complacency is based on the fact that on our approach there is 
indeed good reason, beyond sheer prejudice, to choose between the h-r-w
language and the h-m-a language. On our approach, for a given world, we
choose one language over another as being more appropriate to that world. 
A world, we repeat, is made up entirely of atomic facts: different atomic 
facts yield different worlds. And we take the world comprised of the atomic 
facts that it is hot, raining and windy to be different from the world 
comprised of the atomic facts that it is hot, Minnesotan and Arizonan. 

We justify this position on the basis of our methodological position. As 
mathematicians we are interested in constructing models of the world—the 
real world out there. We do not believe, Wittgenstein-like, that the real
world out there is conveniently divided into basic building-blocks called 
atomic facts. But, as mathematicians, we do believe that we can model the 
world. Not all at once, probably, but focusing attention on one particular 
part or aspect or context at a time. In this way we build models of motion, 
population growth, cloud formation, economic patterns—or, for that 
matter, verisimilitude. And that part or aspect or context we are focusing 
on (and which, confusingly, we have called in this paper a ‘world’) we may 
attempt to carve up into atomic facts. In which case the choice of atomic
facts is our choice. Accordingly, the decision to model the weather by the
attributes of heat, rain and wind yields a different ‘world’ than the decision 
to use the attributes of heat, Minnesotaness and Arizonaness. That is, the 
decisions yield different models. Moreover, for each model we pick a language
appropriate to it, i.e., convenient for it, chosen above others for the
usual considerations of simplicity, power, elegance and the like. From this 
perspective, the h-m-a language simply does not fit the world of heat, rain 
and wind. After all, it seems excessively awkward to herald rain by shouting 
‘Hark! It is hot if and only if it is Minnesotan!’

And we do not agree that it would be ‘sheer prejudice’ to treat the 
h-r-w language as ‘more fundamental’ than the h-m-a language in order to 
describe the world of heat, rain and wind. It is chosen for that world because 
it is more appropriate to it than the h-m-a language. And the h-m-a language 
is more appropriate to the world of heat, Minnesotaness and Arizonaness, 
which is a different world. It is then in our view not surprising that Miller’s 
translation procedure can result in a reversal of verisimilar ordering. Veri- 
similitude, like truth, is defined relative to a world. For the same world 
verisimilitude, like truth, should be independent of translation into equi- 
valent languages—in this we agree with Miller. But just as the same sentence 
may be true in one world and false in another, so the verisimilar ordering 
of sentences may vary from one world to another. 


ACKNOWLEDGEMENTS 


This paper was written while the first author was a Visiting Fellow at the 
Automated Reasoning Project of The Australian National University; and 
we thank Robert K. Meyer and Michael McRobbie for making available 
this opportunity. Financial assistance from the South African Council for 
Scientific and Industrial Research and the University of Stellenbosch is 
gratefully acknowledged. Thanks are due to David Miller and to the referee, 
both of whom pointed out a rather silly mistake in an earlier version of this 
paper. In addition we thank David Miller for correspondence on the alleged 
language dependence of verisimilitude. 


Department of Mathematics
University of Cape Town
Rondebosch 7700
South Africa


Department of Mathematics
Rand Afrikaans University
PO Box 524, Johannesburg 2000
South Africa


REFERENCES


ANDERSSON, G. [1978]: ‘The problem of verisimilitude’, in G. Radnitzky and G. Andersson
(eds.), Progress and Rationality in Science, pp. 291-310. D. Reidel Publishing Co. (Synthese
Library, Vol. 125).

BRINK, CHRIS [1986]: ‘Power structures and logic’, Quaestiones Mathematicae, 9, pp. 69-94.

MILLER, DAVID [1974]: ‘Popper’s qualitative theory of verisimilitude’, The British Journal
for the Philosophy of Science, 25, pp. 166-77.

ODDIE, GRAHAM [1981]: ‘Verisimilitude reviewed’, The British Journal for the Philosophy of
Science, 32, pp. 237-65.

POPPER, K. R. [1972]: Objective Knowledge. Oxford University Press.

RASIOWA, H. [1974]: An Algebraic Approach to Non-classical Logics. North-Holland. (Studies
in Logic and the Foundations of Mathematics, Vol. 78.)

TICHY, PAVEL [1974]: ‘On Popper’s definitions of verisimilitude’, The British Journal for the
Philosophy of Science, 25, pp. 155-60.

TICHY, PAVEL [1976]: ‘Verisimilitude redefined’, The British Journal for the Philosophy of
Science, 27, pp. 25-42.

URBACH, PETER [1983]: ‘Intimations of similarity: the shaky basis of verisimilitude’, The
British Journal for the Philosophy of Science, 34, pp. 266-75.


Brit. J. Phil. Sci. 38 (1987), 551-560 Printed in Great Britain


Some Problems for Bayesian 
Confirmation Theory 
by CHARLES S. CHIHARA 


1 Introduction
2 The Old Evidence Problem
3 The Counterfactual Response
4 The Garber-Eells Response
5 A New Problem for Bayesian Confirmation Theory


1 INTRODUCTION


I begin this paper with a problem that Clark Glymour has posed for 
Bayesians, involving the discovery that some old evidence is found to 
support a new theory. I discuss two solutions to this problem: one proposed 
by several philosophers but criticised in detail by Glymour; another set 
forth by Daniel Garber and developed by Ellery Eells. I then go on to 
describe a situation in which a new theory is found to be supported by some 
old data, a discovery which produces a rise in confidence in the new theory,
but which does not seem to be adequately analysable by classical Bayesian 
theory. It is the position of this paper that this difficulty for classical 
Bayesian theory is not resolvable by the Garber—Eells approach and that 
the aspect of Bayesian theory giving rise to this problem is quite different 
from the one on which Garber and Eells have focused. 


Received November 1986


2 THE OLD EVIDENCE PROBLEM


One of Glymour’s criticisms of Bayesian Confirmation Theory (in [1980],
pp. 85-93) is based on the fact that some theories may gain evidential
support from data gathered before the theory was formulated. Glymour
gives as examples the support Copernicus’ theory had from earlier astronomical
observations, the support Newtonian theory derived from already
accepted second and third laws of Kepler, and the support Einstein’s gravitational
field equations obtained from previously observed anomalies in the
perihelion of Mercury. The reason this raises a problem for Bayesians is
this: the Bayesian holds that an agent’s degrees of belief can be represented
by a function, P(x), from propositions to [0, 1]. In the case of a rational
agent, this function is held to satisfy the standard axioms of probability and,
not surprisingly, is frequently called the agent’s (subjective) probability
function. According to classical theorists, P(T/E), which by definition is
P(T & E)/P(E), gives the degree of belief in T that the agent should have
after E is learned. So it is generally held that a rational agent will change 
his/her degree of belief in T, upon learning E (and nothing else that is 
relevant), according to the Rule of Conditionalisation, which says that the 
agent’s new degree of belief in T should equal the agent’s old degree of 
belief in (T & E) divided by the agent’s old degree of belief in E.¹ Now it
is plausible to hold that E confirms theory T if and only if the learning of 
E would produce in a rational agent an increase in confidence in T. Thus, 
we obtain the criterion: 


E confirms T if and only if P(T/E) > P(T). 
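
The Rule of Conditionalisation and the confirmation criterion just stated can be made concrete on a four-cell probability table. This is a minimal sketch under stated assumptions: the prior weights are hypothetical and chosen only for illustration, and the function names are not drawn from any source.

```python
from fractions import Fraction

# Hypothetical prior over the four combinations (T, E); the numbers are illustrative only.
PRIOR = {
    (True, True): Fraction(3, 10),    # T & E
    (True, False): Fraction(1, 10),   # T & ~E
    (False, True): Fraction(2, 10),   # ~T & E
    (False, False): Fraction(4, 10),  # ~T & ~E
}


def T(world):
    return world[0]


def E(world):
    return world[1]


def prob(p, event):
    """Probability of an event: total weight of the worlds in which it holds."""
    return sum(weight for world, weight in p.items() if event(world))


def conditional(p, a, b):
    """P(A/B) = P(A & B) / P(B); undefined when P(B) = 0."""
    p_b = prob(p, b)
    if p_b == 0:
        raise ValueError("P(A/B) is undefined when P(B) = 0")
    return prob(p, lambda w: a(w) and b(w)) / p_b


def conditionalise(p, e):
    """Rule of Conditionalisation: the new weight of each world is old P(world & E) / old P(E)."""
    p_e = prob(p, e)
    if p_e == 0:
        raise ValueError("cannot conditionalise on an event of probability 0")
    return {world: (weight / p_e if e(world) else Fraction(0))
            for world, weight in p.items()}


def confirms(p, e, t):
    """The criterion in the text: E confirms T iff P(T/E) > P(T)."""
    return conditional(p, t, e) > prob(p, t)
```

With these weights the agent's degree of belief in T rises from 2/5 to 3/5 on learning E, so E confirms T by the criterion.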


In the sort of case Glymour cites, we have some evidence E that is already 
known when the question of confirmation is raised, i.e., P(E) = 1. In that 
case, it can be proved that no matter what T is, 


P(T/E) = P(T). 


It follows from the above criterion that E cannot confirm T at this time. 
But this contradicts our intuitive conviction that E can (and does) provide 
confirmation of T. 
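
The proof alluded to above is short. The following derivation uses only the standard axioms of probability and the definition of P(T/E); it is a reconstruction of the routine argument rather than a quotation from Glymour.

```latex
\begin{align*}
P(\neg E) &= 1 - P(E) = 0, \\
0 \le P(T \,\&\, \neg E) &\le P(\neg E) = 0, \\
P(T) &= P(T \,\&\, E) + P(T \,\&\, \neg E) = P(T \,\&\, E), \\
P(T/E) &= \frac{P(T \,\&\, E)}{P(E)} = \frac{P(T)}{1} = P(T).
\end{align*}
```

The third line holds because T is logically equivalent to the disjunction of the two incompatible propositions T & E and T & ¬E.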

But why is this a problem for Bayesianism? On the one hand, there is an 
intuitive notion of evidential support or confirmation; on the other, there 
is the technical notion of confirmation used by the Bayesians. Bayesians 
hold that the technical notion somehow corresponds to the intuitive; but 
surely no one maintains that the technical notion in no way differs from the 
intuitive. So why not say, in response to Glymour, that the cases he cites 
just show what no one doubted in the first place, viz. that the technical 
notion differs from the intuitive? There are several answers that could be 
given here. One answer is to be found in classical Bayesian kinematics, 
which supposedly govern all rational changes in subjective probability. (I 
should add that the obtaining of new information through sense experience 
or observation is not considered to be a ‘rational change of subjective 
probability’.) The old evidence cases Glymour cites cannot be dismissed 
by the classical Bayesian because these are cases in which it seems clear that 
the old evidence should result in an increase in a reasonable person’s 
confidence in the new theories in question, when such an increase in con- 
fidence is not warranted by the classical Bayesian’s Rule of Condition- 
alisation. From the classical Bayesian point of view, it would seem that 
there shouldn’t be any such increase in confidence in the new theory 
because, apparently, nothing new is learned. 


1 When the agent’s degrees of belief change in accordance with the above rule, he is said to
‘conditionalise on E’. A ‘Dutch book’ argument in favor of acting in accordance with this
rule is presented and discussed in Teller [1976]. For a more general discussion of this
rule, as well as an excellent presentation of the principal features of Bayesianism, see the
introductory chapters of Eells [1983].




There have been many and varied responses by Bayesians to meet this 
difficulty, a complete discussion of which would be out of keeping with the 
aims of this paper. However, I shall discuss in what follows two kinds of 
responses to this problem that Glymour has anticipated. 


3 THE COUNTERFACTUAL RESPONSE 


Some Bayesians have argued that the apparent problem arises from a mis- 
application of the Bayesian analysis. It has recently been claimed, for 
example, that the absurd conclusion (that the support furnished by the pre- 
viously obtained data e for hypothesis h must be zero) is engendered by 
taking the wrong stock of background information in terms of which the 
probabilities mentioned in the Rule of Conditionalisation are computed. 
Taking K to be what our background information is at the time we assess
the support the old evidence e furnishes hypothesis h, the counter-intuitive
conclusion ‘arose from relativising the probabilities in the support definition
to the entire set K . . . including e, whereas they should be relativised
to K − {e}’. The justification given for this claim is that ‘the Bayesian
assesses support by how your odds on h would change were you now
counterfactually to come to know e’.¹

Although there may be much to recommend this response, I do not find 
it entirely satisfactory. First note that K − {e} will probably contain many
propositions that imply e. For example, such propositions as p & e and
-e → -q, where p and q are in K, will no doubt be in K. Thus, to avoid
absurdity, it will not be enough to simply delete e from K: one must also 
delete all propositions that imply e. But what else should we delete? The 
problem is to specify in some reasonably precise manner just what this 
stock of background beliefs is to contain. 
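
The difficulty can be illustrated with a toy stock of beliefs and a truth-table entailment check. This is a sketch under loud assumptions: the three atoms and the five members of K are hypothetical, taken from the examples in the preceding paragraph, and the two deletion procedures are only candidate readings of what ‘deleting e’ might mean.

```python
from itertools import product

ATOMS = ("e", "p", "q")  # hypothetical atoms for illustration


def entails(premises, conclusion):
    """Truth-table entailment: the conclusion holds in every valuation satisfying all premises."""
    for values in product((True, False), repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        if all(f(v) for f in premises.values()) and not conclusion(v):
            return False
    return True


def E(v):
    return v["e"]


# A toy background stock K, keyed by a readable name for each proposition.
K = {
    "e": E,
    "p": lambda v: v["p"],
    "q": lambda v: v["q"],
    "p & e": lambda v: v["p"] and v["e"],
    "-e -> -q": lambda v: v["e"] or not v["q"],  # material conditional
}


def delete_sentence(k, name):
    """Naive reading: remove only the named proposition."""
    return {n: f for n, f in k.items() if n != name}


def delete_everything_implying(k, conclusion):
    """Stronger reading: remove each member that entails the conclusion on its own."""
    return {n: f for n, f in k.items() if not entails({n: f}, conclusion)}
```

In this toy case `delete_sentence(K, "e")` still entails e through p & e, and even `delete_everything_implying(K, E)` leaves q together with -e → -q, which jointly entail e although neither does so alone.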

Howson attempts to meet this problem by declaring that we should delete 
from K ‘everything in K dependent on’ e ([1984], p. 246, fn. 1). But what 
does that mean? In particular, what does ‘dependent on’ mean? Are we to 
delete all propositions in K that are logically dependent on e so that any 
truth-functional proposition in which e occurs as a component is to be 
deleted? But that clearly will not be sufficient to yield a stock of background 
beliefs of the sort Howson wants. For there may be many propositions in 
K that are logically independent of e (in this sense) but which make e very
probable just the same.


1 Howson [1985], p. 307. Cf. also Howson [1984]. This type of response to the old evidence
problem has been made by others, e.g., Horwich [1982]. For different kinds of responses,
see Niiniluoto [1983], Good [1968] and [1985], and Campbell and Vinci [1983].


Consider the following example. Suppose that some experimental result
E has been available and widely discussed for many years but that it has
not been realised that E confirms a sophisticated and complex biological
theory T. When E is seen to confirm T, we find ourselves enmeshed in the
old evidence problem. Now even if we delete all propositions that are logically
dependent on E from the stock K of background information we now
have, there will still remain in K much that would not have been in K had 
we not made the observations or performed the experiments that gave rise 
to the information E; and some of these remaining propositions could render 
E extremely probable. That the very cautious and competent researcher A 
is sure of E is one example. Now it might be replied that the proposition 
about A may not be logically dependent on E, but it surely is causally
dependent on (in some sense) the observations or experiments that gave 
rise to our coming to know E, for had these experiments not been per- 
formed, surely A would not have been so sure of E. Perhaps Howson 
would want to delete from K everything that is causally dependent on the 
observations and/or experiments that gave rise to our coming to know E. 
But this would make coming up with a specific stock of background beliefs 
an extremely messy and difficult task. Besides, it is not obvious that even 
deleting all the above propositions would yield that kind of stock of back- 
ground beliefs Howson wants. For the experiments that gave rise to E may 
have affected in countless and practically incalculable ways the degrees of 
belief we have in many other propositions. So in deleting E from our stock 
of background beliefs, do we also attempt to calculate what our degrees of 
belief in the other propositions would have been had we not performed 
the relevant experiments or obtained the information E? And if so, how 
do we make this calculation? The point is, if this sort of defence of Bayesianism
is to work, one needs a reasonably clear procedure or rule for
coming up with the required stock of background information. But this, 
Howson has not given us. Unless and until this problem is dealt with, I 
cannot see how the above suggested defence of Bayesianism is entirely 
satisfactory. 

Closely related to the above is a second reason why I am not entirely 
satisfied by the above response: it does not obviate the difficulty the old- 
evidence problem poses for Bayesian kinematics. For we seem to have a 
case of rational change of subjective probability that is not warranted by 
the classical Rule of Conditionalisation. The solution being proposed is 
directed at showing that the old evidence does constitute support by claim- 
ing that support should be assessed in terms of counterfactual beliefs; but 
it doesn’t address the question of how classical Bayesian kinematics warrant 
the intuitively reasonable rise in subjective probabilities. Indeed, if any- 
thing, it seems to suggest that rational change of subjective probability 
should take place not in accordance with the classical Rule of Con- 
ditionalisation, but rather in accordance with a new rule that makes use of 
counterfactual subjective probabilities. It should be mentioned that How- 
son is not at all concerned with defending Bayesian kinematics, claiming 
that Bayesian kinematics are not part of Bayesian theory ([1984], p. 246), 
and another philosopher who advocates the counterfactual response ex- 
plicitly rejects the classical Rule of Conditionalisation (Horwich [1982], 
p. 32).


4 THE GARBER-EELLS RESPONSE 


As Garber analyses (in [1983]) the cases of old evidence presented by 
Glymour, there is something that is learned when the old already gathered 
data E is found to provide evidence for a new theory T.¹ What is learned is
not E, since E was already known—but one does learn that T is linked 
logically to E in some appropriate way, say, by the relation of implication. 
Might not the boost in confidence in T, which is taken to be provided by 
the old evidence, in fact be produced by the discovery that T implies E?

Unfortunately, such a response is not open to the classical Bayesian, for 
it is maintained that a rational agent’s degrees of belief, P(x), must be 
distributed so as to satisfy the axioms of the probability calculus. In par- 
ticular, 


[1] If A is a logical truth, then P(A) = 1
[2] If A and B are logically incompatible, then
    P(A v B) = P(A) + P(B)


So the classical Bayesian’s rational agent is held to already know all the 
logical relationships that obtain and hence cannot be said to discover that 
T implies E. In other words, if one is to be a Bayesian rational agent (i.e., 
an agent with a coherent subjective probability distribution), one must be 
logically omniscient—something that is completely unrealistic when one is 
dealing with real human beings. So Garber suggests that we opt for a form
of Bayesianism that requires less logical knowledge on the part of the 
agent than is required by the classical view. The details of Garber’s ‘local 
Bayesianism’ need not detain us here. The basic idea is this: Given a 
distinction between ‘local’ and ‘nonlocal’ logical truths, the rational locally- 
Bayesian agent is required to be logically omniscient with respect to only 
the local logical truths. As an example, Garber specifies a language, L*,
containing atomic sentences and truth functional combinations of these
atomic sentences. So the logical truths of L* are just the tautologies built
out of atomic sentences. To be a rational locally-Bayesian agent in this case,
one must have degree of belief 1 in all tautologies of L*. In addition, for
all propositions A and B, if A and B are locally logically incompatible, i.e.,
the conjunction of A and B in L* is truth functionally inconsistent, then
one’s degree of belief in (A v B) must equal one’s degree of belief in A
plus one’s degree of belief in B. The locally-Bayesian agent is thus required
to be logically omniscient with respect to only the tautologies of L*. This
allows the possibility that the rational locally-Bayesian agent may not know
some such logical fact as that T implies or explains E. Thus, to return to
the case in which some old evidence is found to support a new theory T,
something new is learned that gives a boost to the subjective probability of
T: it is the nonlocal logical truth that T implies or explains E that is learned.


1 It should be emphasised that I give only the main idea underlying Garber’s response to
Glymour’s problem; there is much more to be found in the above paper. Jeffrey [1982]
provides additional relevant discussion. Indeed, it would probably be more appropriate to
speak of the ‘Garber-Jeffrey-Eells solution’ instead of the ‘Garber-Eells solution’ as I do in
this paper. But to do so would require some discussion of Jeffrey’s contribution, and this
would require lengthening the introductory section of this paper, something that I wished
to avoid. The reader is encouraged to consult Eells [forthcoming] for more discussion of the
above two papers.
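
The main idea of the local-Bayesian proposal can be pictured with a small probability table. What follows is a minimal sketch and not Garber's own formalism: the sentence ‘T implies E’ is represented by a hypothetical atom X to which the agent may assign a probability below 1, the old evidence E is assumed true in every world so that P(E) = 1, and the weights are invented for illustration.

```python
from fractions import Fraction

# Worlds are pairs (T, X). X stands for the sentence "T implies E", treated as an
# atomic sentence rather than as a logical truth the agent must already know.
# E is old evidence and holds in every world, so P(E) = 1 throughout.
# The weights below are hypothetical.
PRIOR = {
    (True, True): Fraction(3, 10),    # T & X
    (True, False): Fraction(1, 10),   # T & ~X
    (False, True): Fraction(2, 10),   # ~T & X
    (False, False): Fraction(4, 10),  # ~T & ~X
}


def T(world):
    return world[0]


def X(world):
    return world[1]


def E(world):
    """Old evidence: true in every world under consideration."""
    return True


def prob(p, event):
    """Probability of an event: total weight of the worlds in which it holds."""
    return sum(weight for world, weight in p.items() if event(world))


def conditional(p, a, b):
    """P(A/B) = P(A & B) / P(B); undefined when P(B) = 0."""
    p_b = prob(p, b)
    if p_b == 0:
        raise ValueError("P(A/B) is undefined when P(B) = 0")
    return prob(p, lambda w: a(w) and b(w)) / p_b
```

Here conditionalising on E leaves the probability of T at 2/5, as the old evidence problem predicts, whereas conditionalising on X raises it to 3/5; the boost comes from learning the logical relation, not from learning E.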

Eells (in [forthcoming]) is sympathetic with Garber’s attempt to produce 
a more realistic version of Bayesianism. And he agrees with the basic outline
of Garber’s proposal. But he disagrees with Garber on how the line between 
local and nonlocal logical truths should be drawn. Commenting on Garber’s 
example in which the locally-Bayesian agent is required to know all taut- 
ologies, Eells points out the unreasonableness of the position: 


for there are extremely complex tautologies . . . so complex that it would be more 
difficult to recognize them as logically true than it would be to recognize as logically 
true certain simple sentences that are logically true in virtue of their (say) quanti- 
ficational logical form. 


The objection to this example is generalised to apply to any such proposal: 


There will always be extremely complex logically true sentences of the local langu- 
age, and extremely simple logically true sentences “outside” the local language, 
where it will be inappropriate to insist on probability 1 for the former while not so 
insisting in the case of the latter. 


Eells does not attempt to replace Garber’s line between local and nonlocal 
logical truths with a more suitable one; but he does suggest that the appro- 
priate place to draw this line would have to be determined somehow in 
terms of the amount of complexity of the logical truths. 


5 A NEW PROBLEM FOR BAYESIAN CONFIRMATION THEORY 


I now shall sketch a problem for Bayesian confirmation theory that is not 
adequately resolved by limiting the logical omniscience of the agent in the 
way described above. This kind of case hinges on another respect in which 
Bayesianism is unrealistic. To illustrate what I have in mind, I provide the 
reader with the following tale of courage and adventure. 


A young prince has fallen in love with a beautiful princess from a neigh- 
bouring land, the king of which is none too eager to see his favourite 
daughter carried off. To make matters worse for the prince, the king fancies 
himself to be clever—indeed, cleverer than all the young men who come to 
court his daughter. Now it is known that each of the princess’s suitors is 
always given some sort of formidable task to perform in order to prove 
himself worthy. Failure means being the star at the next ‘hungry lion show’. 
Success however would bring not only the princess’s hand, but also half 
the kingdom. Undaunted by the dangers involved, the prince accepts the 
challenge. His task? He is to be locked in a scientific laboratory for a week, 
during which time he is to be given, on the morning of each day of the
week, an object containing a number. On the morning of the seventh day,
he is to guess the final number. If he guesses correctly, he passes the test. 
Otherwise, the lion pit. 

On the first day, the prince receives a sumptuous breakfast in the labora- 
tory. On the breakfast tray, he finds something that looks like a red ping- 
pong ball. Holding it up to the light, he can see that something is inside, 
but he decides to study the object before cutting it open. Forgoing the food, 
he goes to work immediately. He determines the ball’s precise weight, size, 
and density, hoping to find some information that might provide a clue as 
to the number that is contained. Is the particular shade of red significant? 
It is crimson. How about its odour? There is something vaguely familiar 
about it, but the prince is unable to place it. Finally, he cuts open the ball 
and finds the number 47. The prince searches his memory in an attempt 
to correlate the number with something significant, but nothing seems 
especially promising. After a day of fruitless speculation, he gets the second 
ball. It seems to be no different from the first, and the prince begins to 
wonder if the properties of the balls are at all relevant to the numbers. 
Perhaps it would be better to concentrate on the numbers themselves. The 
second number turns out to be 77. What do the two numbers have in 
common? They are both two digit, odd and have seven as the second digit. 
He continues his scientific researches, analysing the material out of which 
the ball is made. Day three brings little that is new. The third number is 
59. And no relationship is found between the numbers he gets and the 
properties of the balls. He decides to concentrate on just the mathematical 
properties of the sequence of numbers he has received. He begins to spend 
an increasing amount of time at the computer getting more and more 
complicated algorithms for generating the sequence he has gotten thus
far. Days four, five, and six pass without any significant new thoughts. All 
the balls seem to be the same. The numbers are all different, but the only 
characteristics they seem to share are these: they are all two digit and they 
are all odd. But on the basis of what he has learned thus far, the prince has 
no idea of what he should guess. There are ninety possible two digit 
numbers and if it is assumed that no number will be repeated in the 
sequence (which may be a dangerous assumption), there are still eighty-four 
two digit numbers from which to choose. And since these all seem equally 
probable to him, his odds of guessing the correct one are not very encour- 
aging. His training in Bayesian theory does not seem to be especially useful 
in this situation: conditionalisation on the successive numbers produces little 
change on the original probabilities assigned. Time is fast running out, and 
the prince is beginning to get desperate. His image of life with the princess 
is fading. He finds himself increasingly thinking of pain, suffering, and 
worst of all, how his princely white uniform will look when shredded and 
spotted with blood. ‘It will be crimson and white,’ he says to himself, ‘the 
colours of the king’s flag.’ And then, something clicks in his head. Thoughts 
come rushing into his mind: 




The balls are crimson! ‘Crimson’ has seven letters. There are seven days in a week. 
Could the numbers be Gödel numbers spelling out ‘crimson’? The king’s logician
has written a logic book. I saw it on the bookshelf. Here it is. Let’s see: what is 47 
the Gödel number of? Eureka! It’s ‘c’. I'll bet the seventh letter will be the Gödel 
number of ‘n’, that is, 69. 


There is no need to complete this story, which like all good fairy tales ends 
with the line ‘and they lived happily ever after’. For our purposes, enough 
has been said. As the classical Bayesian pictures situations of this type, the 
agent has a definite degree of belief in each proposition that is relevant. 
Changes of degrees of belief occur when the agent gets new knowledge. 
These changes take place in accordance with the Rule of Conditionalisation. 
But in real situations, real agents consider only a tiny fragment of the set 
of all propositions that are relevant. This is not because real agents are 
careless or not sufficiently thorough, but because it is simply not humanly 
possible to think of all the possibilities. What is clear in this example is that 
the prince’s subjective probabilities undergo considerable change as a result 
of hitting upon the hypothesis that the numbers are Gödel numbers of
letters. In particular, the prince’s degree of belief in the hypothesis that the 
seventh number will be 69 makes a significant jump. 

What needs investigating is just how this change comes about. Is it the 
case that the increase is due to the discovery of some new matter of fact? 
But the prince was given no new data. All the data had been gathered hours 
earlier. In that case, perhaps the Garber—Eells solution can be applied. 
Thus, let H be the hypothesis that the sequence of numbers is a sequence 
of Gödel numbers that spell out the word ‘crimson’; and let E be the
evidential statement that the first number is 47, the second 77, the third
59, ..., the sixth 71. Might it not be the discovery of the logical fact that
H implies or explains E (relative to the table of Gödel numbers in the king’s
logician’s logic book) that accounts for the change in the prince’s degree of 
belief that the seventh number will be 69? I find this alternative implausible. 
The above suggests that the prince already had the hypothesis H in his 
head and only needed to learn that H implies or explains E in order for 
the changes in subjective probability to occur. But in fact, nothing like 
hypothesis H had entered the prince’s head before the inspiration came. 

But there are other reasons for doubting that mere conditionalisation on 
the logical facts discovered accounts for all the changes in degrees of belief 
that take place after the prince thinks of H. Notice that, in the above 
situation, discovering that 47 is the Gödel number of ‘c’ results in a large
jump in the subjective probability of the proposition that the last ball will 
be the Gödel number of ‘n’. But suppose that the prince had never thought 
of H. Then noticing that 47 is the Gödel number of ‘c’ might easily have 
had little or no effect on the subjective probability that the last ball will be 
the Gödel number of ‘n’. Why ‘n’? Why not ‘d’? Or any other letter? And 
why another Gödel number? Indeed, even discovering that 77 is the Gödel
number of ‘r’ might not have significantly altered the prince’s degree of
belief that the last ball will be the Gödel number of ‘n’, for he might not
have noticed the pattern emerging. It seems clear from examples such as 
the above that just thinking of new possibilities will produce in rational 
people important redistributions of subjective probability. One way of 
analysing the above situation is this: The degree of belief one has in a
proposition is a function of what one sees as the possibilities. In thinking
up new possibilities, one alters the space of alternatives being considered.
This alteration, in turn, produces changes in the distribution of subjective
probabilities. In verifying that H does imply E, the prince’s confidence in 
H goes up. This part of the story can be explained by the Garber—Eells 
analysis. But not all the changes in subjective probability that took place 
can be attributed to the discovery that H implies E. 
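
The contrast between conditionalising and widening the space of alternatives can be sketched numerically. The following is only an illustration under stated assumptions: the range of 100 possible numbers, the credence of 1/2 given to the new hypothesis, and the rescaling rule in `widen` are all hypothetical choices made for the example, not anything proposed in the text.

```python
from fractions import Fraction


def predictive(space, outcome):
    """P(outcome) = sum over hypotheses of P(h) * P(outcome | h)."""
    return sum(credence * likelihood(outcome)
               for credence, likelihood in space.values())


def widen(space, name, credence, likelihood):
    """Add a newly conceived hypothesis and rescale the old credences.

    This is NOT conditionalisation: no new proposition is learned.
    The rescaling rule is an illustrative assumption.
    """
    rescaled = {h: (p * (1 - credence), lik) for h, (p, lik) in space.items()}
    rescaled[name] = (credence, likelihood)
    return rescaled


N = 100  # assumed: the seventh number lies between 1 and 100

# Before the inspiration: only a "no pattern" hypothesis is entertained.
space_before = {"no pattern": (Fraction(1), lambda n: Fraction(1, N))}

# After the inspiration: H predicts that the seventh number is 69.
space_after = widen(space_before, "H: numbers spell 'crimson'",
                    Fraction(1, 2), lambda n: Fraction(int(n == 69)))

p_before = predictive(space_before, 69)
p_after = predictive(space_after, 69)
print(p_before, p_after)  # 1/100 101/200
```

No datum was added between the two calculations; the jump from 1/100 to 101/200 comes entirely from enlarging the set of hypotheses under consideration.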

Avid readers of detective stories will undoubtedly be able to think of 
many concrete examples of this phenomenon. In typical ‘Who done it?’ 
mysteries, the reader is given, in addition to an abundance of clues and 
suspects, much testimony surrounding the death of one or more characters. 
After pondering an especially puzzling mystery, sometimes just thinking 
of some new possibility will produce a radical change in one’s evaluation 
of the likelihood that some character is the murderer. For example, one 
may have taken without question the service station attendant’s statement 
that the person driving the victim’s car was a woman. Then, it may occur 
to the reader that the driver might have been a man dressed as a woman. 
Just such a thought could result in some marked changes in subjective 
probabilities.¹

Finally, let us consider a sort of ad hominem objection to what I say in 
this paper.² In a previous paper, I advocated analysing the confirmational
paradoxes from a Bayesian point of view. Indeed, I even put forward 
solutions to these paradoxes from this perspective (in [1981]). But now I 
seem to be suggesting that the Bayesian theory of confirmation is completely 
unrealistic. How can these different positions be reconciled? In this paper, 
I have been focusing on two respects in which real people are not Bayesian 
agents: it is obvious that no one is logically omniscient; it is equally obvious 
that no one can think of all the possible hypotheses that explain or imply a 
set of data. However, a theory of confirmation that makes unrealistic 


1 To some extent, the sort of problem for Bayesian confirmation theory I raise here was
anticipated. Cf. Teller’s comment (in [1975], pp. 173-4), that 


[I]t is surely overoptimistic to suppose that men could, following Carnap’s prescription, 
once and for all settle on a perfectly general prior probability function which henceforth 
need only change by conditionalization. From time to time we seem to be in need of revising 
our opinion about the confirmation relation between evidence and hypotheses. This is made 
possible by Shimony’s localized investigations with their freshly assigned prior probabilities,
which give scientific investigations “greater openness to the contingencies of inquiry than
possessed by Carnap’s c-functions”.

2 This objection was raised in Paris at le Centre de Recherche sur l’epistemologie et l’autonomie
by Pascal Engel, where a version of this paper was presented in January of 1986. 


assumptions about rational agents may still turn out to be extremely enlight- 
ening when used to analyse actual scientific reasoning; it depends, of course, 
on just how the unrealistic assumptions enter into the analyses. In the 
case of the paradoxes, logical omniscience can be regarded as a relatively 
harmless simplifying idealisation, and since we do not have the introduction 
of new ideas, new theories and new hypotheses into the situations discussed 
in the paradoxes, which might produce the kind of disruption of the sub- 
jective probability space discussed in this paper, the second of the un- 
realistic assumptions does not seem to be unreasonable.¹


University of California, Berkeley 


REFERENCES 


CHIHARA, CHARLES [1981]: ‘Quine and the Confirmational Paradoxes’, in P. French, T.
Uehling, Jr and H. Wettstein (eds.), Midwest Studies in Philosophy, 6, pp. 425-52.

CAMPBELL, R. and VINCI, T. [1983]: ‘Novel Confirmation’, The British Journal for the Phil-
osophy of Science, 34, pp. 315-41.

EELLS, ELLERY [1982]: Rational Decision and Causality. Cambridge University Press. 

EELLS, ELLERY [forthcoming]: ‘Problems of Old Evidence’, Pacific Philosophical Quarterly. 

GARBER, DAVID [1983]: ‘Old Evidence and Logical Omniscience in Bayesian Confirmation 
Theory’, in John Earman (ed.), Minnesota Studies in the Philosophy of Science, 10, pp. 99- 
131. 

GLYMOUR, C. [1980]: Theory and Evidence. Princeton University Press. 

GOOD, I. J. [1968]: ‘Corroboration, Explanation, Evolving Probability, Simplicity, and a
Sharpened Razor’, The British Journal for the Philosophy of Science, 19, pp. 123-43.

GOOD, I. J. [1985]: ‘A Historical Comment Concerning Novel Confirmation’, The British
Journal for the Philosophy of Science, 36, pp. 184-5.

HORWICH, P. [1982]: Probability and Evidence. Cambridge University Press.

HOWSON, C. [1984]: ‘Bayesianism and Support by Novel Facts’, The British Journal for the
Philosophy of Science, 35, pp. 245-51.

HOWSON, C. [1985]: ‘Some Recent Objections to the Bayesian Theory of Support’, The British
Journal for the Philosophy of Science, 36, pp. 305-9.

JEFFREY, RICHARD [1983]: ‘Bayesianism with a Human Face’, in John Earman (ed.), Minnesota
Studies in the Philosophy of Science, 10, pp. 133-56. University of Minnesota Press.

NIINILUOTO, I. [1983]: ‘Novel Facts and Bayesianism’, The British Journal for the Philosophy
of Science, 34, pp. 375-9.

TELLER, PAUL [1975]: ‘Shimony’s A Priori Argument for Tempered Personalism’, in G.
Maxwell and R. Anderson (eds.), Minnesota Studies in the Philosophy of Science, 6,
pp. 166-203. University of Minnesota Press.

TELLER, PAUL [1976]: ‘Conditionalization, Observation, and Change of Preference’, in W.
Harper and C. A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference,
and Statistical Theories of Science, vol. 1. D. Reidel.


1 The research for this paper was supported by a Fellowship for Independent Study and Research
from the National Endowment for the Humanities. I am grateful to Ellery Eells, Paul Teller, 
Brian Skyrms and an anonymous referee for their helpful comments on an earlier version 
of this paper. 


Brit. J. Phil. Sci. 38 (1987), 561-571 Printed in Great Britain 561 


Discussions 


‘GENES’ AMPLIFIED 


Kitcher [1982] has elaborated a theory of conceptual change based on the 
preservation of reference between classical or ‘transmission’ genetics and 
molecular biology. The approach is based on a common understanding of 
terminology between these two disciplines and the specific illustration used 
is the term ‘gene’. The basic position is that ‘gene’ may be used to refer to 
different entities at different times and in different contexts, but that such 
shifts in reference do not do significant damage to scientific communication 
as long as the parties involved understand what each means in their use of 
the term ‘gene’. Kitcher suggests that this stratagem will allow one to 
‘transcend’ the debate about reduction between transmission genetics and 
molecular biology, which can be interpreted to mean that it could be used 
to dispense with reduction as a strategy in discussing conceptual change 
between these two disciplines. 

As an extension of Kitcher’s argument, I wish to emphasise the import- 
ance of the fact that ‘gene’ can mean more than one thing at a time. Biologists 
often employ different concepts of the gene at the same time and address different 
concepts of the gene simultaneously in the same experiment. Biologists have 
internalised a great many concepts of ‘gene’ and can restrict the gene 
concepts they discuss or employ in a particular context. This restriction 
may occur without explicit elaboration and is based upon contextual cues 
and a shared knowledge of the relevant experiments in genetics and molec- 
ular biology. The explicit realisation that there are some experimental 
protocols which require the simultaneous use of more than one concept of 
‘gene’, has, as will be shown, the consequence of eliminating reduction as 
a necessary strategy for explaining the relationship between genetics and 
molecular biology. 

As an example, the experiments designed to develop methods for the 
detection of inherited disease (Bhattacharya et al. [1984], Gusella et al. 
[1984], Woo et al. [1983]) are of fundamental importance in biology and in 
medicine and are successful only because of the juxtaposition of concepts. 
Genetic diseases were the first sort of phenotype for which it was suggested 
that a biochemical reaction might be responsible, an ‘inborn error of meta- 
bolism’ (Garrod [1902]). There are many serious genetic defects for which 
the biochemical lesion, at the level of an enzymatic reaction, is still 
unknown. For these diseases, it has been impossible to determine whether 
a fetus will develop the disease or whether an adult is a heterozygous carrier 
for the disease. A new technique is being exploited which totally sidesteps
the need to know anything about the nature or physiological action of the
gene products which are ‘mis-produced’ (in whatever way) to cause the 
disease. It is based on the ability to analyse in a simple fashion the DNA 
sequence at a site which is closely linked to, but usually outside of, the gene 
which causes the disease. 

The technological advances which are the basis of this methodology are 
the availability of restriction endonucleases, enzymes which break DNA 
into fragments of different sizes by cleaving the DNA at a nucleotide 
sequence characteristic of the enzyme, and the existence of a ‘library’ of 
human DNA, each ‘volume’ of which is a clone containing an individual 
restriction endonuclease-generated fragment in an easily manipulated form. 
In the diagnostic test, DNA from normal and disease-bearing individuals 
is digested with a restriction endonuclease and the fragments produced are 
separated in an electric field on the basis of size. After a hybridisation 
reaction with a radioactively labelled clone from the library, the fragment 
which contains the same nucleotide sequence as that present in the clone is 
now radioactively labelled and its size can be determined. If this type of 
test is successful, the size of the hybridisable fragment derived from a 
normal individual will be different from that derived from a disease-bearing 
donor. This difference (restriction fragment length polymorphism) 
becomes a marker for the disease (Botstein et al. [1980]). The polymorphism 
arises as a result of a change in the nucleotide sequence which is associated 
in the disease state with the gain or loss of a recognition sequence for the 
restriction endonuclease.¹
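
The logic of the test can be illustrated with a toy digest. This is a minimal sketch, not a model of any real assay: the two DNA strings, the probe position, and the helper names `digest` and `probed_fragment_length` are hypothetical. The default recognition sequence GAATTC, cut after the first G, is that of the enzyme EcoRI.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Return (start, end) intervals of the fragments left after cutting.

    The enzyme cuts `cut_offset` bases into every occurrence of `site`.
    """
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return list(zip(bounds[:-1], bounds[1:]))


def probed_fragment_length(dna, probe_pos, site="GAATTC"):
    """Length of the fragment that contains the position the probe binds to."""
    for start, end in digest(dna, site):
        if start <= probe_pos < end:
            return end - start
    return None


# Hypothetical 47-base region with two recognition sites.
normal = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
# Hypothetical disease-linked variant: one base change destroys the second site.
variant = normal[:36] + "GAGTTC" + normal[42:]

PROBE = 20  # assumed position where the labelled clone hybridises
print(probed_fragment_length(normal, PROBE))   # 26
print(probed_fragment_length(variant, PROBE))  # 36
```

The probe reports only a fragment size. Nothing about the product or function of the linked gene enters the calculation, which is the point the text makes about this technique.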

In the above example two different concepts of the gene are being 
addressed simultaneously. They are: (1) the concept of the gene as the 
determinant of a phenotype (presence or absence of a disease), and (2) a 
new, totally structural concept of the gene in which the phenotype is the 
same as the genotype. All previous concepts of gene have had, at their basis, 
a functional component based on gene activity. The marker for the allelic 
state of the gene may have been structural (presence of a normal versus an 
altered body part), since it was (and still is) accepted that the altered 
structure was the result of gene activity during the process of development. 
Now, a change in the nucleotide sequence of DNA (a change in the geno- 
type) is reflected in the change of the size of a piece of DNA, which in and 
of itself has no functional component. The closeness of phenotype and 
genotype in this situation is shown by the terminology used by human 
geneticists. They speak of e.g., ‘the 4.2 kb allele’ to correlate the presence 


1 It should be understood that a restriction fragment length polymorphism is a predictor of 
the disease state rather than a foolproof diagnostic tool. The presence of the polymorphism
may be masked by a recombination event in an individual. As well, some polymorphisms
are detected at sites within the gene, as with phenylketonuria. The cloned fragment which 
is used in the hybridisation reaction may contain the gene whose mutation causes the disease 
or it may be ‘anonymous’: nothing may be known about the DNA contained in the clone, 
other than the size of the fragment to which it hybridizes is different in normal and disease- 
bearing individuals. 


of the disease state with the size of the fragment in kilobases. The allelic 
state, i.e., the genotype, is defined by the size of a piece of DNA. In at least
one case, the size of the fragment is also the phenotype. Heterozygotes for 
some genetic diseases appear normal and their phenotype can only be 
detected by the presence of a restriction fragment length polymorphism. 

Another dimension to these concepts is that the first is historically ‘old’ and 
derived from Mendelian genetics; the second is historically ‘new’ and can 
be considered a molecular biological concept. If the practice of biology is
to have relevance to the philosophy of biology, the experiment described 
above should address the possibility of reduction between ‘transmission 
genetics’ and molecular biology because of the juxtaposition of concepts of 
‘gene’ from these two disciplines in one experiment. Concepts of ‘gene’ can 
be envisioned as discrete conceptual entities arranged on a crude historical 
continuum. Since the trend is chronologically toward molecular biological 
concepts, the tendency has been to suggest that genetics has been reduced 
to molecular biology. Reduction seems to be a linear, vectorial function: 
science ‘a’ being reduced to science ‘b’ being reduced to science ‘c’. If it 
can be shown that individual concepts of ‘gene’ are discrete, and that they 
remain in the vocabulary of the science in the ‘unreduced’ form after 
the development of new concepts, then the experiment described above 
demonstrates that the continuum between genetics and molecular biology 
can be circularised. 

To show that concepts of ‘gene’ are discrete, two arguments will be 
presented. The first is based on the nature of the change between concepts. 
All of the concepts of ‘gene’ historically earlier than the concept exemplified 
by restriction fragment length polymorphisms shared two common charac- 
teristics: genotype and phenotype were separable, and the gene was analysed 
within the physical limits of that portion of the genome which coded for 
the protein specified by the gene. For restriction fragment length poly- 
morphisms, the genotype and the phenotype are merged and the allelic 
state of the gene can be monitored at a site outside of the physical limits of 
the gene. From the point of view of the biologist, this is a radical break 
with previous concepts of ‘gene’ and shows that, at least at one site on the 
continuum, concepts can be delimited. 

For the second argument, let us assume that the continuum of concepts 
is seamless. This would require a biological basis for the continuity and the 
most likely candidate would be the nucleotide sequence of DNA. This 
assumption is based on the argument that the basis of heredity is DNA and 
has been so throughout at least the history of genetics, even though it 
was not always recognised as such. However, any given stretch of DNA 
is meaningless without reference to concepts of ‘gene’. Figure 1A shows a 
hypothetical nucleotide sequence. Figure 1B shows the same sequence with 
the portion coding for protein broken into codons with the abbreviations 
for the amino acids indicated. Figure 1C includes the above information as 
well as indicating by underlining additional regulatory signals for initiation 


A: TATAAAGAAAGAACCAGGAAGTCAAAAATGCCAAAAAAGAAGAGAAAGGTATAACC
   AAACCAACTATGTTTCTCTGTTTGGAATAAA


B: TATAAAGAAAGAACCAGGAAGTCAAAA ATG CCA AAA AAG AAG AGA AAG GTA TAA
                               met pro lys lys lys arg lys val
   CCAAACCAACTATGTTTCTCTGTTTGGAATAAA


C: TATAAAGAAAGAACCAGGAAGTCAAAA ATG CCA AAA AAG AAG AGA AAG GTA TAA
                               met pro lys lys lys arg lys val
   CCAAACCAACTATGTTTCTCTGTTTGGAATAAA


Figure 1. Hypothetical nucleotide sequence and its relation to concepts of ‘gene’. A: Nucleotide
sequence. B: Nucleotide sequence with codons displayed and translated. C: As B, with
promoter for transcription (TATAA), codon for initiation (ATG) and termination (TAA) of
translation and polyadenylation signal (AATAAA) underlined.


and termination of transcription/translation. When an investigator exam- 
ines a DNA sequence, he or she immediately begins to look for landmarks 
such as these and then displays the sequence in a format which highlights 
the biological signals, which are components of different concepts of ‘gene’. 
Some of these signals have been identified by examining many different 
genes and finding common or ‘consensus’ sequences from initially mean- 
ingless tracts of nucleotides; these were later shown to serve a common 
function for the genes with which they are associated. If none of these signals 
is present, the sequence cannot be placed in context and is meaningless. To 
return to the original argument, DNA sequence cannot serve as a common 
thread between gene concepts if it is meaningful only in light of these 
concepts. 
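
The search for landmarks described above can be sketched in a few lines. This is an illustration only: the sequence is one assumed reading of the hypothetical sequence in Figure 1, the codon table covers just the codons that occur in it, and the function name `annotate` is invented for the example.

```python
# Assumed reading of the hypothetical sequence in Figure 1.
SEQUENCE = (
    "TATAAAGAAAGAACCAGGAAGTCAAAA"        # upstream region, begins with TATAA
    "ATGCCAAAAAAGAAGAGAAAGGTATAA"        # coding region, ATG ... TAA
    "CCAAACCAACTATGTTTCTCTGTTTGGAATAAA"  # downstream region, ends with AATAAA
)

# Partial codon table: only the codons used in the figure.
CODONS = {"ATG": "met", "CCA": "pro", "AAA": "lys", "AAG": "lys",
          "AGA": "arg", "GTA": "val", "TAA": "stop"}


def annotate(seq):
    """Locate the signals named in the figure caption; -1 means 'not found'."""
    promoter = seq.find("TATAA")
    start = seq.find("ATG", promoter + 5 if promoter != -1 else 0)
    peptide, stop = [], -1
    if start != -1:
        # Read the sequence three bases at a time from the initiation codon.
        for pos in range(start, len(seq) - 2, 3):
            residue = CODONS.get(seq[pos:pos + 3], "???")
            if residue == "stop":
                stop = pos
                break
            peptide.append(residue)
    polya = seq.find("AATAAA", stop + 3) if stop != -1 else -1
    return {"promoter": promoter, "start": start, "peptide": peptide,
            "stop": stop, "polyA": polya}


result = annotate(SEQUENCE)
print(" ".join(result["peptide"]))  # met pro lys lys lys arg lys val
blank = annotate("CCCCCCCCCCCC")    # no landmark is found in this tract
```

Every step presupposes a concept: a promoter, a reading frame, a stop codon. Applied to a tract with none of these signals, the function returns nothing but "not found" values, which mirrors the claim that such a sequence cannot be placed in context.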

The introduction of newer gene concepts has not resulted in discarding 
older concepts. A new or novel concept of gene emerges and is different 
from previous concepts when it changes the perception of biologists about 
(a) the nature of the genetic material, (b) the organisation of the genetic 
material, (c) the way the genetic material is expressed, in ways that can be 
experimentally exploited. The experimental statement of a new concept is a
new genotype. To test the new concept, the phenotype associated with a 
pre-existing concept is uncoupled and used as the end-point of an experi- 
ment. The application of the phenotype changes without altering the con- 
cept to which it was originally coupled or destroying the validity of the 
original coupling.¹ Clearly part of the current utility of older concepts of
‘gene’ is the phenotypes to which they are coupled. 

The example and analysis given above show that conceptual connections
between genetics and molecular biology can be circularised in at least one
way and suggest that within the context of a given experiment, no
conjunction of concepts of ‘gene’ need a priori be excluded. Since this is the


1 By analogy, on some word processing programs a file can be copied leaving the original file
and file name intact. 


case, reduction is not necessary to explain the relationship between concepts 
in genetics and molecular biology since they are related in a non-linear, 
non-directional, non-predictable manner. 

This type of analysis may hold for other disciplines as well, e.g., between
classical and modern physics and/or classical and modern chemistry. 
Eventually, concepts in cell biology and developmental biology may be 
used in the same experiment with concepts from genetics and/or molecular 
biology. As this occurs, it will obviate the need for a reductive relationship 
between development or cell biology and molecular biology.¹

The dispute over whether or not genetics can be reduced to molecular 
biology has been going on for many years. The present contribution may 
not alter long-held views but, if nothing else, perhaps it has added two new 
facets to the discussion. One is the irrelevance of reduction to explain the 
relation between genetics and molecular biology. The actual irrelevance is 
clear to a working scientist and has been noted previously, e.g., by Schaffner 
[1974]. The second is the notion that the detailed examination of the fine 
structure of scientific investigation itself can have an impact on philosophy 
of biology. 


ACKNOWLEDGEMENTS 


I appreciate exhaustive discussions with Richard M. Burian, which helped 
shape this analysis, as well as the cogent comments of Doris Teichler- 
Zallen, Duncan M. Porter and Susan Miller. I thank Joseph Pitt for intro- 
ducing me to philosophy of science. 


MURIEL LEDERMAN 

Department of Biology, Virginia Polytechnic Institute and
State University, Blacksburg, VA 24061


REFERENCES 


BHATTACHARYA, S. S., WRIGHT, A. F., CLAYTON, J. F., PRICE, W. H., PHILLIPS, C. I.,
MCKEOWN, C. M., JAY, M., BIRD, A. C., PEARSON, P. L., SOUTHERN, E. M., EVANS, H.
J. [1984]: ‘Close Genetic Linkage between X-linked retinitis pigmentosa and a restriction
fragment length polymorphism identified by recombinant DNA probe L1.28’, Nature,
309, pp. 253-255.

BOTSTEIN, D., WHITE, R. L., SKOLNICK, M. and DAVIS, R. W. [1980]: ‘Construction of a
Genetic Linkage Map in Man Using Restriction Fragment Length Polymorphisms’,
American Journal of Human Genetics, 32, pp. 314-331.

GARROD, A. E. [1902]: ‘The Incidence of Alkaptonuria: a Study in Chemical Individuality’,
The Lancet, ii, pp. 1616-1620.

GURDON, J. B., FAIRMAN, S., MOHUN, T. J. and BRENNAN, S. [1985]: ‘Activation of Muscle 
Specific Actin Genes by an Induction between Animal and Vegetal Cells of a Blastula’, 
Cell, 41, pp. 913-922. 


1 For examples of experiments which are relevant to the expansion of genetics and molecular
biology to developmental biology, see Gurdon et al. [1985] and the characterisation of the 
homoeo box of Drosophila (McGinnis et al. [1984], Scott and Weiner [1984]). 


GUSELLA, J. F., TANZI, R. E., ANDERSON, M. A., HOBBS, W., GIBBONS, K., RASHTCHIAN,
R., GILLIAN, T. C., WALLACE, M. R., WEXLER, N. S. and CONNEALLY, P. M. [1984]:
‘DNA Markers for Nervous System Diseases’, Nature, 225, pp. 1320-1326.

KITCHER, P. [1982]: ‘Genes’, The British Journal for the Philosophy of Science, 33, pp. 337-359.

MCGINNIS, W., GARBER, R. L., WIRZ, J., KUROIWA, A. and GEHRING, W. J. [1984]: ‘A
Homologous Protein-Coding Sequence in Drosophila Homeotic Genes and Its Con-
servation in Other Metazoans’, Cell, 37, pp. 403-408.

SCHAFFNER, K. F. [1974]: ‘The Peripherality of Reductionism in the Development of Molec-
ular Biology’, Journal of the History of Biology, 7, pp. 111-139.

SCOTT, M. P. and WEINER, A. J. [1984]: ‘Structural Relationships Among Genes that Control
Development: Sequence Homology between the Antennapedia, Ultrabithorax and fushi
tarazu Loci of Drosophila’, Proceedings of the National Academy of Sciences, 81, pp. 4115-4119.

WOO, S. L. C., LIDSKY, A. S., GUTTLER, F., CHANDRA, T. and ROBSON, K. H. J. [1983]:
‘Cloned Human Phenylalanine Hydroxylase Allows Prenatal Diagnosis and Carrier
Detection of Classical Phenylketonuria’, Nature, 306, pp. 151-155.


IS SPIN A CONSEQUENCE OF RELATIVITY? 
A COMMENT ON MORRISON 


Recently in this journal Morrison [1986] claimed: ‘One further point of 
importance is that Dirac’s relativistic wave equation for the hydrogen atom 
coincided with the one obtained previously by Darwin.’ (Italics by me.) 
But according to Darwin [1928], his set of two wave functions is only an 
approximation to Dirac’s set of four wave functions from 1928. Darwin 
[1927] also emphasised that there is no difference between the mathematics 
of his work and Pauli’s non-relativistic theory. 

After these preparatory remarks let us consider the following claim of 
Morrison [1986]: ‘He (Dirac) succeeded in proving that the electron spin 
arises as a necessary result of the relativistic formulation of quantum mech- 
anics.’ This contradicts a statement of Hestenes & Gurtler [1971]: ‘The
Dirac theory takes account of relativity, but contrary to a widespread 
opinion, it says nothing fundamental about spin that is not already in the 
Pauli theory.’ (See also Levi-Leblond [1974], Biedenharn [1983].) That is, 
spin is described more or less accurately also by wave equations that are 
not Lorentz invariant (and especially need not take into account relativistic 
mass variation); even Schrödinger’s scalar theory takes care of spin to some
extent, though in a less familiar way (H. & G.). 

Nevertheless one may suppose that for describing spin as the electron’s 
clock some portion of relativity theory is indispensable, that spin is based 
largely on a relativistic effect which occurs also at low velocities, namely 
relativity of simultaneity; this enters the Lorentz transformation equations 
through the factor c²/v that can be interpreted as the velocity of a wave
serving synchronisation (especially self-synchronisation). But this idea 
would need further elaboration in a separate paper. 


JOACHIM VON PESCHKE 
Sonnenbühlstr. 20, D 775 Konstanz


REFERENCES 


BIEDENHARN, L. C. [1983]: Found. Phys., 13, p. 13. 

DARWIN, C. G. [1927]: Proc. Roy. Soc. London A 116, p. 227; [1928]: Proc. Roy. Soc. London
A 118, p. 654.

HESTENES, D. and GURTLER, R. [1971]: Am. J. Phys., 39, p. 1028, introd.

LEVI-LEBLOND, J. M. [1974]: Revista del Nuovo Cimento, 4, p. 99, esp. section 3.2 and
appendix A.

MORRISON, M. [1986]: Brit. J. Phil. Sci., 37, p. 101, esp. p. 103.


LAWSON ON THE RAVEN PARADOX AND BACKGROUND 
KNOWLEDGE 


In his [1985] Tony Lawson reconsiders the significance of the Raven Para- 
dox for confirmation theory. He takes off from Alan Musgrave’s [1974]. 
Musgrave had distinguished between purely logical and logico-historical 
approaches to confirmation. A purely logical approach assesses the con- 
firmation given by evidence e to hypothesis h purely in terms of the logical 
relations between e and h. A logico-historical approach introduces temporal
or historical considerations. Musgrave commended my [1964] solution, 
which was along Popperian lines; e can corroborate h only if it was the 
result of a genuine test on h, and whether it was a genuine test depends on 
background knowledge k. But Musgrave then went on to spell out some 
serious difficulties with the notion of background knowledge. If construed 
subjectively, as what an individual scientist happens to know at a given 
time, confirmation becomes person-relative. If construed objectively, as 
everything that is ‘known to science’ at a given time, then evidence e that 
was in k but which had played no role in the construction of an impressive 
theory T cannot confirm T, even if T explains e in a natural and uncontrived 
way. In a well known example of this, e is evidence concerning the pre- 
cession of the perihelion of Mercury and T is the General Theory of 
Relativity. Zahar [1973] had proposed a solution to this problem, but 
Musgrave feared that this might reintroduce person-relativism. 

Lawson’s argument, as I understand it, runs like this: (1) This paradox
can indeed be avoided only by giving a role to background knowledge. 
(2) However, (1) above does not tell in favour of ‘logico-historical’ and 
against ‘purely logical’ theories; on the contrary, it creates serious embar- 
rassments for the former, but not for the latter. (3) Therefore, to avoid 
this paradox one should opt for a ‘purely logical’ theory in which back- 
ground knowledge plays a role different from the one it plays in logico- 
historical theories. 

Apropos (1), Lawson recalls that Hempel had rejected any appeal to 
background knowledge as a way of eliminating his “paradoxical” or coun- 
ter-intuitive result; for Hempel regarded this as counter-intuitive only to 
untutored intuition; once we realise that ‘All ravens are black’ says some- 
thing, not about ravens only, but about all things, namely that each of
them is non-raven-or-black, we see that there is nothing amiss in its being
confirmed by white shoes or whatever. But Hempel, Lawson says, was 
alone here in excluding background knowledge; all other accounts ‘involve 
the introduction of background knowledge’ (p. 395, n. 1); and rightly so in 
his eyes: ‘It seems clear that in order to obtain an intuitively acceptable 
solution to the paradox of confirmation the essential requirement is the 
introduction of background knowledge’ (p. 401). 

Apropos (2), Lawson is content, for the most part, to regurgitate the 
difficulties highlighted in Musgrave [1974]. (He alludes briefly to Zahar’s 
[1973] and Worrall’s [1978] attempts to free the idea of a novel fact from 
the requirement that it be temporally novel, but without saying whether he 
thinks that they have overcome the difficulty.) 

Apropos (3), Lawson’s proposal is that background knowledge will tell 
us, with respect to a given hypothesis, what is its field F of potential appli- 
cation, or what is the set F of things to which we will confine ourselves in 
testing and confirming (disconfirming) it. Given a universal hypothesis 
(∀x)[Ax ⊃ Bx], he proposes to reformulate it as (∀x ∈ F)[Ax ⊃ Bx]. Here,
F may be identical with A, or it may be somewhat wider and contain A as 
a proper subset. When the hypothesis is formulated contrapositively, it is 
still governed by this same quantifier. Thus if the hypothesis were ‘All 
ravens are black’, and F were either the set of all ravens or, say, the set of 
all birds, then white shoes and red herrings would not confirm it. 
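
The effect of restricting the quantifier to a field F can be shown with a small sketch. It is an illustration only: the four objects, the `Thing` record, and the helper `instances_in_field` are hypothetical, and "confirming instance" is read here simply as an object inside F that satisfies the conditional.

```python
from typing import NamedTuple


class Thing(NamedTuple):
    kind: str
    colour: str
    is_bird: bool


def satisfies(thing):
    """Truth value of 'if x is a raven then x is black' for one object."""
    return thing.kind != "raven" or thing.colour == "black"


def instances_in_field(things, in_field):
    """Split the objects inside the field F into confirming and refuting ones.

    Objects outside F are ignored altogether.
    """
    relevant = [t for t in things if in_field(t)]
    confirming = [t for t in relevant if satisfies(t)]
    refuting = [t for t in relevant if not satisfies(t)]
    return confirming, refuting


observed = [
    Thing("raven", "black", True),
    Thing("swan", "white", True),
    Thing("shoe", "white", False),
    Thing("herring", "red", False),
]

# Unrestricted quantifier: every observed object counts as an instance.
unrestricted, _ = instances_in_field(observed, lambda t: True)
# Quantifier relativised to F = the set of all birds.
among_birds, _ = instances_in_field(observed, lambda t: t.is_bird)

print([t.kind for t in unrestricted])  # ['raven', 'swan', 'shoe', 'herring']
print([t.kind for t in among_birds])   # ['raven', 'swan']
```

With F taken as the set of birds, the white shoe and the red herring drop out, while a non-black bird that is not a raven still counts, as the contrapositive form relativised to F would have it.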

Reading this stirred a memory: hadn’t Jerzy Giedymin once proposed 
something similar? After searching unsuccessfully, I rang him up. It turned 
out that he had indeed published something along these lines, but in Polish, 
in [1969]. He has kindly sent me his English translation of relevant passages, 
from which I will now quote an extract: 


If the testing of a hypothesis is relativised to a domain . . . and to background 
knowledge, then the quantifiers which occur in the hypothesis should similarly be 
relativised to the universe of the domain. . . . The denotations of the predicates (in 
the hypothesis) are then subsets of that universe and the same applies to their 
complements (i.e. ‘non-raven’ does not denote the set of all physical objects that 
are not ravens). For example, if the raven hypothesis is to be tested in a domain 
whose universe is the set of all birds, then the raven hypothesis and its equivalent 
by contraposition should be taken as the following statements: 


For every x in the set of birds, if x is a raven then x is black, 
For every x in the set of birds, if x is not black then x is not a raven. 


The quantifiers in the raven hypothesis need not be relativised to the set of 
all birds but may, instead, be relativised to a proper subset . . . to which both 
uncontroversial . . . and controversial ravens belong (p. 229).


So Lawson is in good company. He objected that my [1964] account brought 
in ‘an extra assumption’ (p. 402). Rather than defend that superseded 
account, I would point out that the revised (though still essentially Popperian)
account of corroboration presented in my [1984] (which was not available 
to Lawson) dispenses with background knowledge in Popper’s sense, but still
avoids the Raven Paradox (and, I believe, the other well known paradoxes 
of confirmation). Background knowledge in Popper’s sense includes various 
general theories and assumptions. This is replaced, in my [1984] account, 
by the much simpler and less problematic idea of a historical record of tests. 
This latter makes mine again a ‘logico-historical’ rather than a ‘purely 
logical’ theory. 

I had come to regard background knowledge as an undesirable com- 
ponent in an objective and impersonal theory of corroboration for several 
reasons, including the difficulties highlighted by Musgrave. Another 
difficulty is this. Suppose, contrary-to-fact, that there were agreed criteria 
for what should be included in background knowledge. Popper spoke of ‘the 
vast amount of background knowledge which we constantly use’ ([1963], p. 
168); could one ever spell out everything that belongs in it? Surely not. If 
background knowledge k were allowed to entail either e or not-e, then p(e/k) 
could have a determinate value, of 1 or 0, even though k had not been fully
articulated. But in Popper’s use of k to measure the severity of a test, the 
test is severe if the difference between p(e/hk) and p(e/k) is large; and for 
this to obtain, k must entail neither e nor not-e; but in that case, how can 
p(e/k) have a determinate value if we cannot spell out everything that is in 
k? We might have overlooked something that bears significantly on e. 
Another consideration was that k had been called on, in Popper’s system, 
to play what was actually an inductive role in accounting for the diminishing 
severity of repetitions of the same kind of test on h, as was pointed out 
independently by Musgrave [1975] and O’Hear [1975]. 

I agree that my idea of a historical record of tests is a shade idealised. It 
is supposed to record, for each experiment hitherto made, what initial 
conditions were deliberately realised, what sorts of outcome were to be 
measured and with what degree of precision, and what, in the event, the 
actual outcome was. No doubt, some scientific experiments have been per-
formed of which no adequate record was preserved. But in contrast with
“background knowledge”, we do at least know in principle what should be 
included in it. One can imagine an ideal scientific community in which 
information of the above kinds concerning all experimental tests is recorded 
in a central computer’s memory, enabling anyone who has derived some 
testable law-statement g from some theory T to ask the computer to check 
whether g would have been at risk from any previous test and, if so, what 
the result would have been. This idea permits us to classify tests on a theory 
as hard, medium, or soft; roughly, the test will be hard if either: (i) T is 
breaking new ground with this g, which would not have been at risk from 
any previous test in the historical record; or (ii) T is here challenging some 
rival theory which entails g’ which is a counterpart of g; and g diverges 
slightly from g’; and g’ has passed tests, but these tests would not have been 
stringent enough to discriminate between g and g’, had g been up for test 
when they were performed. A test is soft if the historical record shows that 
it is essentially a repetition of an earlier test that g would have passed. Thus
I no longer need anything like the difference between p(e/hk) and p(e/k) as 
a measure of the severity of tests. I will not repeat here my rebuttal (pp. 
296-7) of the objection that some inductivist assumption is involved in the 
above classification. 
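
As a rough illustration only, the hard/medium/soft classification described above might be sketched in Python. Nothing here is taken from the [1984] account: the names `RecordedTest`, `at_risk`, `passes`, `discriminates` and `classify` are hypothetical, outcomes are assumed to be single numbers, and a test is assumed to discriminate between g and g′ only when their predictions differ by more than its precision.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class RecordedTest:
    """One entry in the (idealised) historical record of tests."""
    conditions: str    # initial conditions deliberately realised
    precision: float   # degree of precision of the measurement
    outcome: float     # the actual outcome

# A law-statement maps conditions to a predicted outcome,
# or to None when it says nothing about those conditions.
Law = Callable[[str], Optional[float]]

def at_risk(law: Law, rec: RecordedTest) -> bool:
    """The law would have been at risk if it predicts anything here."""
    return law(rec.conditions) is not None

def passes(law: Law, rec: RecordedTest) -> bool:
    """The law would have passed if its prediction lies within precision."""
    predicted = law(rec.conditions)
    return predicted is not None and abs(predicted - rec.outcome) <= rec.precision

def discriminates(rec: RecordedTest, g: Law, rival: Law) -> bool:
    """Crude assumption: the test separates g from its rival only if
    their predictions differ by more than the test's precision."""
    a, b = g(rec.conditions), rival(rec.conditions)
    return a is not None and b is not None and abs(a - b) > rec.precision

def classify(g: Law, conditions: str, precision: float,
             record: List[RecordedTest], rival: Optional[Law] = None) -> str:
    # (i) g breaks new ground: no previous test would have put it at risk.
    if not any(at_risk(g, r) for r in record):
        return "hard"
    # Soft: essentially a repetition of an earlier test g would have passed.
    if any(r.conditions == conditions and r.precision <= precision
           and passes(g, r) for r in record):
        return "soft"
    # (ii) the rival has passed tests, none stringent enough to tell g from g'.
    if rival is not None:
        rival_passed = [r for r in record
                        if r.conditions == conditions and passes(rival, r)]
        if rival_passed and not any(discriminates(r, g, rival) for r in rival_passed):
            return "hard"
    return "medium"

def g(conditions: str) -> Optional[float]:
    return 10.3 if conditions == "free fall" else None

def g_rival(conditions: str) -> Optional[float]:
    return 10.0 if conditions == "free fall" else None

record = [RecordedTest("free fall", 0.5, 10.1)]
```

On these toy figures, a new test of g at precision 0.05 against the rival comes out hard under clause (ii), whereas repeating the recorded test at precision 0.5 comes out soft.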

It seems to me that some of the dangers attendant on the idea of “back-
ground knowledge” apply to Lawson’s proposal. Could there be objective
criteria to determine just what should be included in his F? Could not 
different people draw the boundary differently, so that what are confirming 
instances for some are neutral instances for others? There would be no 
uncertainty if F were always required to be identical with A (the set of all 
ravens in the case of ‘All ravens are black’). But would that be desirable? 
Lawson himself wobbles a bit on this. At one place he writes: ‘Usually, I 
believe, F will be identical with A’ (p. 396); but later, and in a different 
context, he writes: ‘I am suggesting that in general F is not identical to A’ 
(p. 399). I say that there is a compelling reason why F must not be identical 
with A. Since as long ago as my [1960] I have been maintaining that 
an adequate theory of corroboration must leave open the possibility that 
evidence concerning things that turn out to be non-A may corroborate 
(∀x)[Ax ⊃ Bx]. (Lawson correctly reports me to this effect, p. 401.) Agassi,
in [1959], had given the following example: ‘All freely falling bodies fall 
with constant acceleration’ is to be tested by monitoring the fall of steel 
balls released from an electromagnet at the top of a deep mineshaft. It is 
found that, during a segment of their path, the acceleration of these balls 
varies: the hypothesis appears to be refuted. Subsequent investigation 
reveals, however, that the mineshaft passes through a stratum of magnetic 
rock and that their passage through this stratum coincides with the balls’ 
inconstant acceleration. Thus they were not freely falling bodies; yet they 
may have provided rather striking corroboration for the hypothesis. 

If that point is well taken, the question arises, how far should F extend 
beyond A? I cannot myself envisage a non-arbitrary answer to this question; 
but suppose there is one. Suppose, roughly in line with Giedymin and 
Lawson, that in the case of ‘All ravens are black’ it is non-arbitrary to make 
F the set of all birds. We now reformulate this hypothesis contrapositively 
as (∀x ∈ F)[~Bx ⊃ ~Ax]. Then what, on the Giedymin-Lawson account,
is to stop me “confirming” this hypothesis by going to a swannery and
observing thousands of white swans? For each swan is a member of F and 
also both ~B and ~A. 
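
The swannery objection can be made concrete with a small sketch. The following Python fragment is only an illustration and is not drawn from Giedymin or Lawson: the class `Thing`, the predicate names and the toy universe are hypothetical, and F is assumed to be the set of all birds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thing:
    kind: str
    colour: str

# Assumption: the field F of potential application is the set of all birds.
BIRD_KINDS = {"raven", "swan"}

def in_F(x: Thing) -> bool:
    return x.kind in BIRD_KINDS

def is_raven(x: Thing) -> bool:      # predicate A
    return x.kind == "raven"

def is_black(x: Thing) -> bool:      # predicate B
    return x.colour == "black"

def positive_instance_of_contrapositive(x: Thing) -> bool:
    """x counts as an instance of (for all x in F)[~Bx -> ~Ax]
    when it lies in F and is both ~B and ~A."""
    return in_F(x) and not is_black(x) and not is_raven(x)

white_shoe = Thing("shoe", "white")
white_swan = Thing("swan", "white")

# Relativising to F screens off the shoe but not the swan.
shoe_counts = positive_instance_of_contrapositive(white_shoe)
swan_counts = positive_instance_of_contrapositive(white_swan)
```

Restricting the quantifier to birds excludes white shoes, but every white swan still qualifies as an instance of the contrapositive.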

Having indicated how objects that are both ~B and ~A may corroborate
(∀x)[Ax ⊃ Bx], perhaps I may, in conclusion, indicate how my [1984]
theory endorses the thesis that objects that are both A and B may never-
theless fail to corroborate it. Let us (with apologies to Francis Bacon) take
the following as our hypothesis h: ‘All sailors who pay their vows escape
shipwreck.’ Assume that evidence e lists many instances of sailors who did 
pay their vows and who did escape shipwreck, and no instance of a sailor 
who did not escape shipwreck having paid his vows. For Hempel, this e
assuredly confirms this h; and Lawson would presumably agree with this, 
since the sailors listed in e must have been within h’s field of potential
application. But I say that we must first consult the historical record of 
tests and find out what initial conditions were deliberately realised when 
this evidence was gathered. Suppose we find that the investigators began 
by satisfying themselves, in each case, that the sailor in question had escaped 
shipwreck; only then did they inquire whether he had paid his vows. In that 
case, h was at no risk from whatever results the investigation might lead to,
and this e provides no corroboration for this h.


JOHN WATKINS 
The London School of Economics 


REFERENCES 


AGASSI, J. [1959]: ‘Corroboration versus Induction’, British Journal for the Philosophy of 
Science, 9, pp. 311-17. 

HEMPEL, C. G. [1945]: ‘Studies in the Logic of Confirmation’, Mind, 54, pp. 1-26 and 97-
121; reprinted in Hempel, C. G. [1965]: Aspects of Scientific Explanation. New York:
Free Press. 

GIEDYMIN, J. [1969]: ‘O Tzw Paradoksie Konfirmacji’, in Fragmenty Filozoficzne (Fest-
schrift for T. Kotarbinski).

LAWSON, T. [1985]: ‘The Context of Prediction (and the Paradox of Confirmation)’, British
Journal for the Philosophy of Science, 36, pp. 393-407. 

MUSGRAVE, A. E. [1974]: ‘Logical versus Historical Theories of Confirmation’, British Journal
for the Philosophy of Science, 25, pp. 1-23.

MUSGRAVE, A. E. [1975]: ‘Popper and “Diminishing Returns from Repeated Tests” ’, Austra- 
lasian Journal of Philosophy, 53, pp. 248-53. 

O’HEAR, A. [1975]: ‘Rationality of Action and Theory-Testing in Popper’, Mind, 84, pp. 273-83.

POPPER, K. R. [1959]: The Logic of Scientific Discovery. Hutchinson.

POPPER, K. R. [1963]: Conjectures and Refutations. Routledge. 

WATKINS, J. [1960]: ‘Confirmation Without Background Knowledge’, British Journal for the 
Philosophy of Science, 10, pp. 318-20.

WATKINS, J. [1964]: ‘Confirmation, the Paradoxes, and Positivism’, in Bunge, M. (ed.), The
Critical Approach to Science and Philosophy, pp. 92-115. 

WATKINS, J. [1984]: Science and Scepticism. Princeton University Press and Hutchinson. 

WORRALL, J. [1978]: ‘The Ways in which the Methodology of Scientific Research Programmes 
Improves on Popper’s Methodology’, in G. Radnitzky and G. Andersson (eds.), Progress 
and Rationality in Science, pp. 45-70.

ZAHAR, E. G. [1973]: ‘Why Did Einstein’s Programme Supersede Lorentz’s?’, British Journal 
for the Philosophy of Science, 24, pp. 95-123 and 223-62.


Brit. J. Phil. Sci. 38 (1987), 573-577 Printed in Great Britain


Reviews 


NEWTON-SMITH, W. H. [1985]: Logic: An Introductory Course. Routledge 
and Kegan Paul. Pp. ix+230. £5.95. 


Logic: An Introductory Course (with the supplement of a computer teaching 
programme) by W. H. Newton-Smith is, as the title suggests, intended as 
a text book to accompany a lecture course in logic for first-year under- 
graduates who do not come from a mathematical background. It is perhaps 
destined to become the latest in a short line of set texts at Oxford University, 
the present incumbent being Logic by Wilfrid Hodges, whose predecessor 
was Beginning Logic by E. J. Lemmon. The latter (written by a philosopher) 
adopts a formal approach and, in very stark style, presents the propositional 
and predicate calculi as mathematical structures; the former, on the other 
hand (written by a mathematician), de-emphasises the role of an artificial 
language and concentrates on the analysis of English sentences. 

Newton-Smith’s book is firmly in the tradition of Lemmon’s and, in fact, 
admits in the preface to being influenced by it. Actually, the choice of 
material and its order of presentation in the two books is very similar 
indeed; the difference is that Newton-Smith assumes a much more chatty 
style that makes the content of his book far more accessible to a readership 
of non-mathematicians. Thus Newton-Smith combines positive aspects of 
both of the other two books: the content of Lemmon and the form of 
Hodges, a concise and streamlined description of propositional and first- 
order predicate logic presented in a manner that is not too daunting for 
readers unused to formal systems. 

As is inevitable with books on logic at this level, the first half covers the 
propositional calculus and the second deals with the predicate calculus; 
there is also a last word on Challenges and Limitations. Newton-Smith opens 
with a discussion of what logic is all about, defining carefully what is meant 
by a valid argument. This gentle introduction should be reassuring to the 
reluctant logician—no baptism of fire here. Next he adopts the semantic 
approach to propositional logic, adducing the usual logical connectives and 
their truth-tables one by one; in particular, he gives a lengthy treatment of 
the only really problematic connective, the conditional, and the thorny 
paradoxes of material implication. Given these tools, Newton-Smith 
explains how to use truth-tables to test the validity of an argument. In the 
third chapter of his book, having dealt with semantics, he unfolds the syntax 
of the propositional calculus. At this point his account diverges sharply 
from Hodges’ because Newton-Smith, like Lemmon, employs Gentzen’s
rules for natural deduction, whereas Hodges favours the tableau method. 
The elaboration of the ten rules is unavoidably somewhat tedious but it is, 
to be fair, generously peppered with examples. 
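
The truth-table test of validity mentioned above is mechanical enough to sketch in a few lines. The following Python fragment is an illustration only and is not taken from Newton-Smith's book or its accompanying programme; the function names `is_valid` and `implies` are hypothetical.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """The material conditional: false only when a is true and b is false."""
    return (not a) or b

def is_valid(premises, conclusion, variables):
    """An argument is valid if no row of the truth-table makes
    every premise true while making the conclusion false."""
    for values in product([True, False], repeat=len(variables)):
        row = dict(zip(variables, values))
        if all(p(row) for p in premises) and not conclusion(row):
            return False   # this row is a counterexample
    return True

# Modus ponens: P -> Q, P, therefore Q.
modus_ponens = is_valid(
    [lambda r: implies(r["P"], r["Q"]), lambda r: r["P"]],
    lambda r: r["Q"],
    ["P", "Q"],
)

# Affirming the consequent: P -> Q, Q, therefore P.
affirming_consequent = is_valid(
    [lambda r: implies(r["P"], r["Q"]), lambda r: r["Q"]],
    lambda r: r["P"],
    ["P", "Q"],
)

# A paradox of material implication: not-P, therefore P -> Q.
paradox = is_valid(
    [lambda r: not r["P"]],
    lambda r: implies(r["P"], r["Q"]),
    ["P", "Q"],
)
```

Here modus ponens and the paradoxical inference both come out valid, while affirming the consequent is rejected by the row in which P is false and Q is true.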

Having admirably distinguished between and developed the syntax and 
semantics of the propositional calculus, Newton-Smith presents an elemen- 
tary meta-theory for it, which demonstrates that the two approaches are 
actually equivalent. First, he defines recursively the set of well-formed 
formulas of the propositional language and discusses the expressive 
adequacy of the set of logical connectives that are used to form it. Then he 
forges on to a proof of the soundness of the propositional calculus: the proof 
is by induction, a case by case demonstration that each of the rules of natural 
deduction is truth-preserving—although the analysis of several of the rules 
is left as the inevitable ‘exercise for the reader’. The pièce de résistance of
this section—indeed of the whole book—is a proof of the completeness of 
the propositional calculus. It is a rather sophisticated proof, chosen because 
it can most easily be generalised to predicate logic, which employs the 
notions of a maximally consistent set of formulas and a model for a set of 
formulas. 

After this neat treatment of propositional logic, Newton-Smith turns 
to the predicate calculus, opening with a discussion of predication and 
quantification. He gives a very clear explanation of how to symbolise many 
different types of sentences, and also extends the natural deduction rules 
to incorporate the universal and existential quantifiers. Newton-Smith also 
describes how to symbolise sentences expressing identity, using Russell’s 
theory of descriptions, and sentences referring to at least, at most and 
exactly a certain number of objects and, once again, this rather tricky topic 
is handled in a pleasingly clear way. There is, moreover, a short discussion 
of some of the philosophical problems encountered in connection with 
definite descriptions. The section on the predicate calculus concludes with 
the standard treatment of the theory of relations and logical scope, and with 
a discussion of the semantics of the predicate language that hinges on the 
recursive notion of the satisfaction of a formula; the point is made clearly 
that while truth-tables can be used to establish quite mechanically whether
a sequent of the propositional calculus is correct or not, no such
procedure exists for predicate logic. Finally, Newton-Smith mentions some 
of the problems that arise in developing an artificial language, including 
some remarks on intuitionistic logic. 

All in all, Logic: An Introductory Course is an admirable little book (albeit 
occasionally somewhat untidy). It presents the standard subject matter in 
a manner that is lucid, but not so stark as to be intimidating to the non- 
mathematician; moreover, it addresses some of the philosophical problems 
of the area. A common shortcoming of logic courses is that the student 
fails to perceive the distinction between the syntactic and semantic 
approaches and is then, unsurprisingly enough, unimpressed by the crux
of the matter, namely the soundness and completeness results that demonstrate
the two approaches to be equivalent. However, Newton-Smith avoids this 
pitfall by clearly signposting the contours of the topic. Moreover, the 
quantifiers of the predicate calculus are introduced painlessly, and many 
examples of their use in translating from the natural language into the 
artificial one are given, peopled largely by Icabod the Balliol student and 
Reagan the American president. One cavil that must be voiced is that no 
answers to the exercises are given: what is the point of asking the question 
without supplying the answer? But at least Newton-Smith is aware that 
readers of logic books, and other people, come in two genders, and he uses 
the full richness of the English language accordingly. 


CAROLINE DUNMORE 
Wolfson College, Oxford 


NERSESSIAN, NANCY J. [1984]: Faraday to Einstein: Constructing Meaning 
in Scientific Theories. Martinus Nijhoff. Pp. xiv + 196 (ISBN 90-247- 
2997-1). 


This is supposed to be a book about the meaning of scientific terms. 
Beginning with a critical review of philosophical theories of meaning and 
meaning change in science, continuing with a historical case study of con- 
ceptual development, it builds toward a new proposal which is at once to 
answer philosophical concerns and to receive both impetus and warrant 
from scientific practice. The philosophical and historical scenes are mas- 
terfully set, but the action falls flat. The reader will be grateful for an 
anticipation that propels him through parts I and II, despite its conversion
into frustration in part III. 

Part I is a delight. Its breezy, charming style makes fast work of major 
positions from Carnap to Kripke, giving a persuasive rational recon- 
struction of their historical connections. The upshot is that no extant theory 
of meaning does justice to science because all are grounded in a priori 
philosophising about language. Reductionist theories ignore the growth of 
meaning and depend on untenable dichotomies. The “network view” 
rejects the dichotomies and accommodates growth, but at the cost of mean- 
ing variance which generates the roadblock of incommensurability. The 
causal theory avoids incommensurability by introducing an allegedly unten-
able essentialism.

Part II is a fine study, grounded in primary literature, of the evolution
of the electromagnetic field concept. Admittedly limited to highlights, this 
study graphically captures the conceptions of key figures and the rationales 
for their differences. We follow the concept from Faraday’s inchoate 
attempt at making sense of continuous action to Einstein’s “independent
reality”. There are a few revelations, such as Lorentz’s consideration of 
physical interpretations other than contraction of the length ratio between 
moving and aether-based systems. But little is new or revisionist, and few 
comparisons with other interpretations are offered. The decision to avoid 
symbolic notation complicates the exposition. We may fault some details (a
confusion of the scientific doctrine of the relativity of simultaneity with the 
philosophical doctrine of conventionality of simultaneity, for example). But 
the scholarship is first rate, and there is the same unerring sense as in part 
I of what points are most deserving of emphasis. 

The philosophical problem that the proposal of part III is supposed 
to solve is incommensurability: As meaning changes with theory, uses of 
a term in different theories do not express the same concept. Accordingly, 
theories cannot sustain the logical relations presupposed by attempts at 
comparative evaluation. The solution is that uses of a term in different 
theories are commensurable stages in the growth of a single concept if 
connected by chains of reasoning; that is, if there are scientific reasons for 
the changes of theory reflected in differing uses of the term. That’s it. The 
big breakthrough, evidently, is that we are to look at actual science and 
discern rationality in its transitions, whereas the philosophical mistake that 
gave us incommensurability, committed alike by logical empiricism and 
the causal theory of reference, was to focus exclusively on language. 

This “breakthrough”, by now second nature to an entire generation of 
philosophers of science, is at most a methodology; it is not a philosophical 
theory, and it solves no philosophical problems. Indeed, in Nersessian’s 
application it exacerbates problems. For it is almost an article of faith of this 
methodology that every substantive change in theory has an autonomously 
scientific rationale, that chain-of-reasoning linkages ultimately connect, 
albeit indirectly, everything that happens with everything else that happens 
in science. And if that is so, their mere presence can hardly serve to 
distinguish conceptual growth from conceptual diversity. Whereas incom- 
mensurabilists err on the side of diversity, holding at the extreme that 
every theory and every individual scientist is insulated from every other, 
Nersessian errs on the side of homogeneity, suggesting what amounts to a 
sweeping, indiscriminate monism. 

The one claim of part III that offers any hope of discrimination is that 
the theoreticians responsible for different stages in the growth of the single 
concept ‘field’ were “all reasoning about the same basic problem” (p. 155);
perhaps concepts are to be individuated by problems. But apart from the 
implausibility of supposing that a single problem cannot generate diverse 
conceptual responses, the history of part II belies the claim. There we trace
the changes in the problems and interests responsible for the differing
approaches and conceptions of these theoreticians. Of course, these prob-
lems and interests are “related”. But at a sufficiently high level of abstrac-
tion, everything is related. We are given no criterion for the individuation
of problems, and none for deciding whether or not the conceptions pro-
duced by reasoning about the same problem are phases in the growth of a 
single concept. 

Chain-of-reasoning reconstructions may well assuage worries over 
incommensurability in specific cases, but they hardly constitute a theory of
meaning for science. 


JARRETT LEPLIN 
University of North Carolina, Greensboro 


